

## 28.8 A 5.8GHz Power-Harvesting 116μm×116μm "Dielet" Near-Field Radio with On-Chip Coil Antenna

Bo Zhao, Nai-Chung Kuo, Benyuanyi Liu, Yi-An Li, Lorenzo Iotti,  
Ali M. Niknejad

University of California, Berkeley, Berkeley, CA

The proliferation of the Internet of Things (IoT) and low-power sensors would benefit greatly from batteryless compact radios that require no external components. Such a radio could be used for future RFID, wearable/implantable devices, and counter-counterfeit electronics. Previous demonstrations have focused primarily on miniaturizing the radio size [1,2], but they rely on off-chip antennas. Two challenges must be overcome to enable sufficiently small radios with on-chip antennas [3]: 1) In most state-of-the-art radios [3-6], the device antenna is connected to both the downlink signal path and the power management unit. Relying on separate signal and power paths incurs an area penalty and degrades antenna matching. In addition, a high downlink data-rate is difficult to achieve because the weak power transfer cannot tolerate the disturbance of high-index data modulation. 2) The ultra-small die area precludes the use of any on-chip energy storage (such as large capacitors or inductors) or a large antenna, so sufficient power must be supplied wirelessly. The wireless power tone acts as a large blocker for the uplink communication signal, resulting in a poor uplink signal-to-noise ratio (SNR) and signal-to-blocker ratio (SBR). In [4], the radio dimension was reduced to 500μm×250μm, but the achieved uplink SNR was only 10dB. To circumvent the blocker issue in the uplink, dual antennas were adopted in [5] and [6] to split the wireless powering tone and uplink tone into two frequencies, resulting in a die area larger than 4.4mm<sup>2</sup>.

We have demonstrated an RFID system used to detect counterfeit electronics that addresses the design challenges in several ways: 1) On the downlink, the power-transfer path is reused for data communication, minimizing chip area. The wireless powering signal is transmitted via ASK with <4% modulation index, minimally impacting the delivered power. Instead of directly connecting a demodulator to the antenna [3-6], the AC-DC rectifier serves as the demodulator followed by an ASK detector, so it does not load the antenna. 2) For the uplink, we propose a hybrid 2<sup>nd</sup>-3<sup>rd</sup>-order intermodulation injection-locking (HIMIL) technique, where a two-tone waveform is generated by a reader for power delivery. A clean uplink carrier is produced by injection-locking the carrier oscillator (CO) on the radio chip to the 2<sup>nd</sup>-order intermodulation (IM2) of the two tones. Meanwhile, utilizing the 3<sup>rd</sup>-order intermodulation (IM3) as the uplink provides isolation from the two-tone blockers. As a result, both the uplink SNR and SBR can be significantly improved.

The system architecture is shown in Fig. 28.8.1, where the chip contains an on-chip coil antenna, an AC-DC rectifier, a power-on reset (PoR), a bandgap, LDOs, memory, and communication blocks. When the rectenna outputs a DC voltage $V_{DC} > 0.85\text{V}$ to power up the chip, the bandgap generates a 0.8V reference voltage and 100nA constant currents $I_{CONT}$ for all LDOs and submodules. Because the ultra-small die area cannot fit sufficient decaps, separate LDOs are adopted to isolate supply interference among the submodules. During downlink, the two tones are set to the same frequency $f_1 = f_2 = 5.74\text{GHz}$ and in phase, but $f_1$ is modulated by the authentication data. The CMOS radio's on-chip antenna picks up the <4% ASK signal, which is then rectified to produce data with an amplitude less than 4% of $V_{DC}$, requiring amplification by the subsequent ASK detector. After the radio chip authenticates the reader, the flag signal "TXEN" changes from 0 to 1 to initiate uplink.

The reader periodically enters uplink mode to check whether data is coming from the radio chip. Switching from downlink to uplink, the two tones generated by the reader separate to  $f_1 = 5.768\text{GHz}$  and  $f_2 = 5.728\text{GHz}$ . Due to the nonlinearity of the radio chip's AC-DC rectifier, an IM3 tone  $2f_1 - f_2 = 5.808\text{GHz}$  is generated and transmitted to the reader. Meanwhile, the IM2 component  $f_1 - f_2 = 40\text{MHz}$  is passed to the rectifier output  $V_{DC}$ , and then added to the reference current of the CO, as shown in Fig. 28.8.1. The CO is designed to be injection-locked to the 40MHz IM2, resulting in a low-noise strong carrier  $f_{UP} = 20\text{MHz}$ , which improves the uplink SNR. The 4kHz clock generator is used to serially read data from an on-chip 4x4 electron-beam-written (EB-written) memory that modulates the CO. The modulated carrier  $f_{UP}$  is then mixed with the 5.808GHz IM3 tone. As a result, the uplink signal received by the reader is found at  $2f_1 - f_2 + f_{UP} = 5.828\text{GHz}$ , which is far away from the  $f_1$  and  $f_2$  blockers.
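The frequency plan above can be checked with a few lines of arithmetic. The sketch below only restates the tone frequencies quoted in the text (in MHz, to keep the arithmetic exact); the function and variable names are illustrative and not part of the original design.

```python
# Uplink frequency plan, using the tone frequencies quoted in the text (in MHz).
F1_MHZ = 5768    # powering tone f1 during uplink
F2_MHZ = 5728    # powering tone f2 during uplink
F_UP_MHZ = 20    # carrier oscillator output once locked to the 40MHz IM2


def uplink_plan(f1_mhz, f2_mhz, f_up_mhz):
    """Return the intermodulation products and the resulting uplink frequency."""
    im2 = f1_mhz - f2_mhz        # 2nd-order product, added to the CO reference current
    im3 = 2 * f1_mhz - f2_mhz    # 3rd-order product generated by the rectifier
    uplink = im3 + f_up_mhz      # f_UP mixed with the IM3 tone (upper sideband)
    return {
        "im2_mhz": im2,
        "im3_mhz": im3,
        "uplink_mhz": uplink,
        "offset_from_f1_mhz": uplink - f1_mhz,
        "offset_from_f2_mhz": uplink - f2_mhz,
    }


plan = uplink_plan(F1_MHZ, F2_MHZ, F_UP_MHZ)
print(plan)
```

With these values the uplink signal sits 60MHz above $f_1$ and 100MHz above $f_2$, which is the separation from the blockers that the text relies on.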

Figure 28.8.2 shows some of the key circuit details. The on-chip coil antenna is designed with a patterned ground shield and occupies a die area of 116μm×116μm, with all the circuits located inside the coil. The antenna has an inductance of 4.83nH and a Q-factor of 14, whereas the frequency response of the rectenna indicates that the proposed two-tone technique can be supported with little influence on the power efficiency. The bottom path of the ASK detector ($M_6$-$M_7$ and Decap#2) forms an averaging filter, while the top path ($M_4$-$M_5$) passes the signal envelope, with $M_4$-$M_5$ duplicating $M_6$-$M_7$ to address PVT variations. Typically, $V_{DC}$ varies between 0.85V and 1.7V across powering ranges, so an extra diode $M_3$ is added to support high input voltages. A comparator samples the data by comparing $V_{ENV}$ and $V_{AVE}$, and two subsequent amplifiers increase the data amplitude. The 4kb/s data from the memory turns the CO on and off, whereas the CO is injection-locked to IM2 on the reference current. A Manchester data-rate of up to 5Mb/s must be supported during downlink (TXEN=0); therefore, the 1.6pF Decap#1 is isolated from the $V_{DC}$ node, and only the 1.2pF Decap#2 affects the signal. During uplink, Decap#1 is turned on to filter the switch-induced ripple at the rectifier output, i.e., the $V_{DC}$ node.
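The envelope-versus-average comparison performed by the ASK detector can be illustrated behaviorally. The following is a minimal sketch, not the circuit itself: it assumes one sample per Manchester half-bit, a peak-to-peak data swing equal to the modulation index times $V_{DC}$, a first-order running average standing in for the $M_6$-$M_7$/Decap#2 path, and a 1-to-(1,0) Manchester convention. All function names and the `alpha` smoothing factor are hypothetical.

```python
def manchester_encode(bits):
    """Map each bit to two half-bit chips (assumed convention: 1 -> 1,0 and 0 -> 0,1)."""
    chips = []
    for bit in bits:
        chips.extend((1, 0) if bit else (0, 1))
    return chips


def rectifier_output(chips, v_dc=1.0, mod_index=0.038):
    """Model the rectified downlink as V_DC with a small data swing riding on it."""
    half_swing = v_dc * mod_index / 2
    return [v_dc + half_swing if chip else v_dc - half_swing for chip in chips]


def ask_detect(samples, v_avg0, alpha=0.05):
    """Compare each envelope sample (V_ENV) with a slowly updated average (V_AVE)."""
    v_avg = v_avg0
    decisions = []
    for v_env in samples:
        decisions.append(1 if v_env > v_avg else 0)   # comparator decision
        v_avg += alpha * (v_env - v_avg)              # averaging path
    return decisions


def manchester_decode(chips):
    """Recover one bit from each pair of half-bit chips."""
    return [1 if pair == (1, 0) else 0 for pair in zip(chips[0::2], chips[1::2])]


payload = [int(c) for c in "00111010"]
samples = rectifier_output(manchester_encode(payload))
decoded = manchester_decode(ask_detect(samples, v_avg0=1.0))
print(decoded)
```

Because Manchester coding is DC-balanced, the running average stays between the two signal levels, which is why a plain comparison against it is enough in this simplified model.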

The testing platform is shown in Fig. 28.8.3. A probe station holds the radio chip in place to test the ranges and misalignment tolerance. For the downlink, the repeated Manchester serial number "00111010" is applied to the mixer to realize the ASK with a modulation index of 3.8%, and the output power of the PA is 32.40dBm, which only decreases the PA peak power by 0.024dB. Port#1 of the duplexer has a passband of 5.725 to 5.770GHz for downlink, whereas Port#3 has a passband of 5.805 to 5.850GHz for uplink. During uplink, the two wireless powering tones 5.768GHz/5.728GHz and the PA output noise at 5.828GHz can be filtered out at Port#3, while the IM3 tone $2f_1 - f_2 = 5.808\text{GHz}$ becomes the dominant blocker. The reader coil dimension is optimized for a 1mm powering range. Given the small size of the "dielet", even a 1mm distance represents a 9:1 ratio in range-to-dimension. In addition to the padless radio chip, a testing chip with pads was also implemented to check key signals such as TXEN, $V_{DC}$, bandgap reference, and the CO frequency.
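As a quick consistency check on the duplexer passbands and the range-to-dimension ratio, the sketch below sorts the tones named in the text into the two ports. It treats each passband as an ideal brick-wall filter, which is a simplification; the dictionary and function names are illustrative only.

```python
# Duplexer passbands quoted in the text, in MHz (idealized as brick-wall filters).
PORT_PASSBANDS_MHZ = {
    "Port#1": (5725, 5770),   # downlink / powering side
    "Port#3": (5805, 5850),   # uplink side
}

# Tones named in the text, in MHz.
TONES_MHZ = {
    "downlink tone": 5740,
    "f1 (uplink mode)": 5768,
    "f2 (uplink mode)": 5728,
    "IM3 tone": 5808,
    "uplink signal": 5828,
}


def ports_passing(freq_mhz):
    """List the ports whose passband contains the given frequency."""
    return [name for name, (lo, hi) in PORT_PASSBANDS_MHZ.items() if lo <= freq_mhz <= hi]


routing = {label: ports_passing(freq) for label, freq in TONES_MHZ.items()}
for label, ports in routing.items():
    print(f"{label}: {ports}")

# Range-to-dimension ratio: 1mm range over a 116um die edge.
range_to_dimension = 1000 / 116
print(f"range-to-dimension ratio: {range_to_dimension:.1f}:1")
```

Both the IM3 tone and the uplink signal fall inside the Port#3 passband, consistent with the IM3 tone being the dominant remaining blocker there, and the ratio of about 8.6 rounds to the quoted 9:1.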

The measured wireless power transfer and the downlink communication results are displayed in Fig. 28.8.4. The harvested power and bandgap reference voltage are measured relative to the reader-to-chip range, and the digital authentication succeeds up to 1mm range. The harvested power at 1mm can reach 9.76μW, resulting in a power efficiency of  $5.6 \times 10^{-6}$ . The results also show that the power efficiency drops by less than 40% at a misalignment of 0.2mm.
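The quoted power efficiency follows from the harvested power and the reader PA output power. The sketch below assumes the efficiency is defined as harvested power divided by PA output power, which matches the reported numbers; the helper name `dbm_to_watts` is illustrative.

```python
def dbm_to_watts(power_dbm):
    """Convert a power level in dBm to watts."""
    return 10 ** (power_dbm / 10) / 1000


PA_OUTPUT_DBM = 32.40        # reader PA output power from the test setup
HARVESTED_W = 9.76e-6        # harvested power at 1mm range

pa_output_w = dbm_to_watts(PA_OUTPUT_DBM)
efficiency = HARVESTED_W / pa_output_w

# A drop of less than 40% at 0.2mm misalignment bounds the efficiency from below.
efficiency_floor_misaligned = 0.6 * efficiency

print(f"PA output: {pa_output_w:.3f} W")
print(f"power efficiency at 1mm: {efficiency:.2e}")
print(f"efficiency floor at 0.2mm misalignment: {efficiency_floor_misaligned:.2e}")
```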

Figure 28.8.5 shows the uplink spectrum tested at Port#3 of the duplexer. At 0.8mm range, the chip's CO cannot lock to the IM2, so no uplink signal shows up at 5.828GHz. As the range is shortened to 0.7mm, the CO is injection-locked by the 40MHz IM2, and an uplink signal appears at 5.828GHz with an SBR of -28.9dB. Without the two-tone configuration, the CO is free running at 23MHz, and the uplink signal displays an SNR less than 2dB even with downconverting noise cancellation. Using the proposed HIMIL technique, the uplink SNR is improved by 46dB. For the 4kb/s uplink, a 42dB uplink SNR is achieved.

Figure 28.8.6 summarizes the performance of this chip and compares it with prior work. It shows that: 1) The 116μm×116μm radio is the smallest radio, and its die area is less than 11% of that of the state-of-the-art radios. 2) The chip achieves the best uplink SNR (42dB) owing to the new uplink concept. 3) The downlink data-rate is 5x faster than that of previous sub-mm-sized near-field radios. The die micrograph is displayed in Fig. 28.8.7, where all the circuits are located under the patterned ground shield.
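The area and data-rate claims can be reproduced from the sizes and rates listed in the comparison table. The sketch below uses the two sub-mm prior radios, [3] and [4]; the dictionary layout is illustrative.

```python
# Die dimensions in micrometres, taken from the comparison table.
THIS_WORK_UM = (116, 116)
PRIOR_SUB_MM_UM = {
    "RFIC'04 [3]": (400, 400),
    "JSSC'13 [4]": (500, 250),
}


def area_um2(dims):
    width, height = dims
    return width * height


this_area = area_um2(THIS_WORK_UM)
area_ratios = {name: this_area / area_um2(dims) for name, dims in PRIOR_SUB_MM_UM.items()}
for name, ratio in area_ratios.items():
    print(f"area relative to {name}: {ratio:.1%}")

# Downlink data-rate relative to the 1Mb/s near-field radio in [4].
downlink_speedup = 5e6 / 1e6
print(f"downlink speed-up: {downlink_speedup:.0f}x")
```

The largest ratio, against the 500μm×250μm radio of [4], is about 10.8%, which is where the "less than 11%" figure comes from.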

### Acknowledgments:

The authors wish to acknowledge the contributions of the students, faculty and sponsors of the Berkeley Wireless Research Center (especially Prof. Borivoje Nikolic, Ajith Amerasekera, Angie Wang, and Andrew Townley), wafer fabrication donation of the TSMC University Shuttle Program, and the support of the DARPA SHIELD program under Grant DARPA-BAA-14-16.

### References:

- [1] S. Pellerano, et al., "A mm-Wave Power-Harvesting RFID Tag in 90 nm CMOS," *IEEE JSSC*, vol. 45, no. 8, pp. 1627-1637, Aug. 2010.
- [2] L. X. Chuo, et al., "A 915MHz Asymmetric Radio Using Q-Enhanced Amplifier for a Fully Integrated 3x3x3mm<sup>3</sup> Wireless Sensor Node with 20m Non-Line-of-Sight Communication," *ISSCC*, pp. 132-133, Feb. 2017.
- [3] M. Usami, "An Ultra Small RFID Chip: μ-Chip," *IEEE RFIC*, pp. 241-244, June 2004.
- [4] W. Biederman, et al., "A Fully-Integrated, Miniaturized (0.125 mm<sup>2</sup>) 10.5 μW Wireless Neural Sensor," *IEEE JSSC*, vol. 48, no. 4, pp. 960-970, April 2013.
- [5] H. Dagan, et al., "A Low-Power Low-Cost 24 GHz RFID Tag With a C-Flash Based Embedded Memory," *IEEE JSSC*, vol. 49, no. 9, pp. 1942-1957, Sept. 2014.
- [6] M. Tabesh, et al., "A Power-Harvesting Pad-Less Millimeter-Sized Radio," *IEEE JSSC*, vol. 50, no. 4, pp. 962-977, April 2015.



Figure 28.8.1: System architecture of "dielet" radio.



Figure 28.8.2: Detailed description of on-chip antenna, downlink ASK detector, and uplink IM2 injection-lock carrier oscillator.



Figure 28.8.3: Measurement setup and measured reader PA power for both one-tone downlink and two-tone uplink.



Figure 28.8.4: Measured results of wireless power transfer and downlink communication.



Figure 28.8.5: Measured uplink SBR and uplink SNR as well as the comparison with conventional backscattering result.

| Radios                   | This Work                    | JSSC'10[1]                         | ISSCC'17[2]                                  | RFIC'04[3]                                   | JSSC'13[4]               | JSSC'14[5]           | JSSC'15[6]                                               |
|--------------------------|------------------------------|------------------------------------|----------------------------------------------|----------------------------------------------|--------------------------|----------------------|----------------------------------------------------------|
| CMOS Process             | 65nm                         | 90nm                               | 180nm                                        | 180nm                                        | 65nm                     | 180nm                | 65nm                                                     |
| Frequency                | 5.8 GHz                      | 47 GHz                             | 915 MHz                                      | 2.45 GHz                                     | 1.5 GHz                  | 24 GHz               | DL <sup>(1)</sup> : 24 GHz<br>UL <sup>(1)</sup> : 60 GHz |
| Near-Field or Far-Field? | Near-Field                   | Far-Field                          | Far-Field                                    | Near-Field                                   | Near-Field               | Far-Field            | Far-Field                                                |
| Antenna Type             | On-Chip (Inductive)          | Off-Chip (3D Magnetic)             | On-Chip (Inductive)                          | On-Chip (Inductive)                          | On-Chip (Dipole)         | On-Chip (Dipole)     | On-Chip (Dipole)                                         |
| Off-Chip Components      | NO                           | NO                                 | Battery Res Caps <sup>(2)</sup> Storage Caps | NO                                           | Electrodes               | NO                   | NO                                                       |
| Modulation               | DL: <4% ASK<br>UL: HIMIL     | DL: None<br>UL: PWM <sup>(3)</sup> | PPM <sup>(4)</sup><br>(100% ASK)             | DL: 100% ASK<br>UL: Direct BS <sup>(5)</sup> | Miller (100% ASK)        | 100% ASK             | DL: 75% ASK<br>UL: PPM                                   |
| Data-Rate                | DL: 5 Mbps<br>UL: 4 kbps     | DL: None<br>UL: 5-50 kbps          | DL: 7.8-62.5 kbps<br>UL: 0.03-30.3 kbps      | DL: N/A<br>UL: 12.5 kbps                     | DL: 1 Mbps<br>UL: 1 Mbps | None                 | DL: 6.5 Mbps<br>UL: 12 Mbps                              |
| Uplink SNR               | 42 dB<br>(4kbps data)        | N/A                                | N/A                                          | N/A                                          | 10 dB                    | 40 dB<br>(No data)   | N/A                                                      |
| Uplink SBR               | -28.9 dB @20 MHz             | N/A                                | N/A                                          | N/A                                          | 17 dBm                   | -50 dB @1 kHz        | N/A                                                      |
| Reader Power             | 32.4 dBm                     | N/A                                | N/A                                          | N/A                                          | 17 dBm                   | 10 dBm               | 45 dBm                                                   |
| Range                    | DL: 1mm<br>UL: 0.7mm         | DL: 2.17m<br>UL: 2.17m             | N/A                                          | DL: 1mm<br>UL: 1mm                           | DL: 20cm<br>UL: 20cm     | DL: 50cm<br>UL: 50cm |                                                          |
| Overall Size             | 116μm×116μm<br>(W/O Antenna) | 1.3mm×0.95mm<br>(W/O Antenna)      | 2.23mm×1.2mm<br>(W/O Antenna)                | 400μm×400μm                                  | 500μm×250μm              | 3.74mm×1.86mm        | 3.7mm×1.2mm                                              |

<sup>(1)</sup> DL: Downlink, UL: Uplink      <sup>(2)</sup> Res Caps: Resonance Capacitors  
<sup>(3)</sup> PWM: Pulse-Width Modulation    <sup>(4)</sup> PPM: Pulse-Position Modulation    <sup>(5)</sup> BS: Backscattering

Figure 28.8.6: Performance comparison.



Figure 28.8.7: Die micrograph.

# Session 29 Overview: *Advanced Biomedical Systems*

## IMMD SUBCOMMITTEE



**Session Chair:**  
**Pedram Mohseni**  
*Case Western Reserve University,  
Cleveland, OH*



**Associate Chair:**  
**Nick Van Helleputte**  
*imec, Heverlee, Belgium*

**Subcommittee Chair:** **Makoto Ikeda**, University of Tokyo, Tokyo, Japan

Advances in biomedical circuits and systems are essential technology drivers in addressing critical societal needs to increase the effectiveness and lower the cost of healthcare. This session highlights the latest advances in implantable, high-density, and wearable systems for neural recording, optogenetics, multimodal cell interfacing, and heart-rate monitoring.

### INVITED PAPER

1:30 PM

#### 29.1 Creating Neural "Co-Processors" to Explore Treatments for Neurological Disorders

*Tim Denison*, Technical Fellow, Medtronic Neurological Technology, Fridley, MN

In Paper 29.1, Medtronic introduces the system architecture for a brain coprocessor exploring novel methods for treating neurological disorders.



2:00 PM

#### 29.2 A Fully Immersible Deep-Brain Neural Probe with Modular Architecture and a Delta-Sigma ADC Integrated Under Each Electrode for Parallel Readout of 144 Recording Sites

*D. De Dorigo*, University of Freiburg - IMTEK, Freiburg, Germany

In Paper 29.2, the University of Freiburg – IMTEK presents a reconfigurable, fully immersible deep-brain neural probe with a modular architecture for parallel readout of 144 recording sites in an 11.5mm needle fabricated in 0.18µm CMOS. Each electrode is equipped with an 11b ΔΣ ADC in an area of 70×70µm<sup>2</sup> and features low noise of 8.1µV<sub>rms</sub> and 13.4µV<sub>rms</sub> in the frequency bands of the two types of neural signals, local field potentials (1 to 300Hz) and action potentials (0.3 to 10kHz), respectively, and crosstalk of -74.4dB at 1kHz with 39.14µW per recording site.



2:30 PM

#### 29.3 A 16384-Electrode 1024-Channel Multimodal CMOS MEA for High-Throughput Intracellular Action Potential Measurements and Impedance Spectroscopy in Drug-Screening Applications

*C. Mora Lopez*, imec, Heverlee, Belgium

In Paper 29.3, imec presents an active microelectrode array (MEA) with 16,384 electrodes and 1,024 channels for multimodal cell monitoring in 0.13µm CMOS. This MEA integrates 6 cell-interfacing modalities, including intracellular recording and fast impedance monitoring in all of the channels, with a total input-referred noise of 7.5µV<sub>rms</sub> and a total power consumption of 95mW.





3:15 PM

#### 29.4 A 0.13µm CMOS SoC for Simultaneous Multichannel Optogenetics and Electrophysiological Brain Recording

*G. Gagnon-Turcotte*, Laval University, Quebec City, Canada

In Paper 29.4, Laval University presents a 0.13µm CMOS IC for simultaneous multichannel optogenetics and electrophysiological brain recording with 10 multimodal recording channels (NEF of 2.30) with ADC, and a 4-channel LED driver circuit. Digitization is done using an in-channel 8.68b ENOB  $\Delta\Sigma$  MASH 1-1-1 ADC with on-chip decimation including AFE, with power consumption of 11.2µW per channel.



3:45 PM

#### 29.5 A mm-Sized Free-Floating Wirelessly Powered Implantable Optical Stimulating System-on-a-Chip

*Y. Jia*, Georgia Institute of Technology, Atlanta, GA

In Paper 29.5, Georgia Institute of Technology and Michigan State University present a mm-sized, free-floating, wirelessly powered, implantable, optical stimulating system-on-a-chip (SoC) in 0.35µm CMOS. A 4x4 µLED array is selectively driven with up to 10mA current, and an optimized 3-coil inductive link delivers more than 2mW at 60MHz along with ASK data at 50kb/s to the SoC.



4:15 PM

#### 29.6 A 92dB Dynamic Range Sub-µV<sub>rms</sub>-Noise 0.8µW/ch Neural-Recording ADC Array with Predictive Digital Autoranging

*C. Kim*, University of California, San Diego, La Jolla, CA

In Paper 29.6, the University of California San Diego presents a 16-channel 92dB input dynamic range, low-noise, low-power neural-recording acquisition system employing predictive digital auto-ranging (PDA) in 0.8V 65nm CMOS. Per-channel ADC with PDA offers a 22dB increase in input dynamic range and 30× improvement in bandwidth and faster than 1ms recovery to 100mV input differential transient artifacts, with noise below 1µV<sub>rms</sub> over DC-to-500Hz bandwidth at 32kHz chopping and 1µA supply current.



4:45 PM

#### 29.7 A 110dB-CMRR 100dB-PSRR Multi-Channel Neural-Recording Amplifier System Using Differentially Regulated Rejection Ratio Enhancement in 0.18µm CMOS

*S. Lee*, Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea

In Paper 29.7, DGIST presents a multichannel neural recording amplifier system with 110dB common-mode and 100dB power supply rejection in 0.18µm CMOS. The neural amplifier system achieves a worst-case total CMRR of 80dB at 1kHz and an electrode impedance of 100kΩ.



5:00 PM

#### 29.8 A 43.4µW Photoplethysmogram-Based Heart-Rate Sensor Using Heart-Beat-Locked Loop

*D-H. Jang*, KAIST, Daejeon, Korea

In Paper 29.8, KAIST presents a low-power, photoplethysmography (PPG)-based, heart-rate monitoring sensor based on a heart beat locked loop (HBLL) architecture in 0.18µm CMOS. The sensor realizes an effective duty cycle of 0.0175% with heart rate error of less than 2.1bpm, while consuming 43.4µW.

## 29.1 Creating Neural "Co-processors" to Explore Treatments for Neurological Disorders

Scott Stanslaski<sup>1</sup>, Jeffrey Herron<sup>1</sup>, Elizabeth Fehrmann<sup>1</sup>, Rob Corey<sup>1</sup>, Heather Orser<sup>1</sup>, Enrico Opri<sup>3</sup>, Vaclav Kremen<sup>2</sup>, Ben Brinkmann<sup>2</sup>, Aysegul Gunduz<sup>3</sup>, Kelly Foote<sup>3</sup>, Greg Worrell<sup>2</sup>, Tim Denison<sup>1</sup>

<sup>1</sup>Medtronic Neurological Technology, Fridley, MN

<sup>2</sup>Mayo Clinic, Rochester, MN

<sup>3</sup>University of Florida, Gainesville, FL

While first-generation implantable systems exist today that *modulate* the nervous system, there is a critical need for advancing neurotechnology to better serve patient populations. The convergence of neuroscience and technologies in circuits, algorithms, and energy transfer methods, combined with the growing burden of neurological diseases, makes this a timely opportunity. Significantly improving systems arguably requires more than an incremental advancement of "deep brain stimulation"; we propose a fundamental shift in mindset in how engineered bioelectronic systems are interfaced with the body to treat disease.

Bioelectronic medicines and the notion of a "brain coprocessor" provide a framework to replace the historical design notions of neuromodulation [1], such as replacing a surgical lesion with tonic stimulation, with the notion of designing a prosthetic nervous system that is adaptable and programmable for patient needs. A key feature of these designs includes real-time responsiveness to patient symptoms and intentions, informed by design principles at the intersection of biology and engineering.

The potential for neural coprocessors is being explored in several investigational human studies. For example, the University of Washington [2] and the University of Florida (shown in Fig. 29.1.1) are exploring novel cortical-thalamic circuit connections in patients with essential tremor as part of the BRAIN initiative. Essential tremor results in uncontrollable shaking symptoms appearing during attended movement; while therapeutic stimulation might only be required during motion, first-generation systems are on continuously, which consumes excess power and can result in unwanted side-effects. To build a more adaptable therapy, the investigational neural coprocessor detects the natural cortical beta desynchronization correlated with movement intention. When intention is detected, the sub-cortical thalamic stimulation is increased to prevent the onset of tremor; when the motor intention signal disappears (beta resynchronization), the stimulation is ramped back down. Fig. 29.1.1 illustrates this coprocessing concept, highlighting the adaptive stimulation in the thalamus as a response to cortical control signals in a human subject. Similar novel network controllers, which explore algorithms that account for variations in pharmacological state, have been demonstrated in Parkinson's disease [3].
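The control behavior described above (increase stimulation when cortical beta power drops, ramp down when it returns) can be sketched as a simple threshold-and-ramp loop. This is a hypothetical illustration only: the threshold, amplitude limits, ramp step, and units are placeholders chosen for readability, and the investigational algorithms in [2] and [8] may differ substantially.

```python
def adaptive_stim_step(beta_power, amplitude, *, threshold, amp_min, amp_max, ramp_step):
    """Advance the stimulation amplitude by one control step.

    Beta power below the threshold is treated as desynchronization
    (movement intention), so the amplitude ramps up; otherwise it ramps down.
    All parameters are hypothetical and in arbitrary units.
    """
    if beta_power < threshold:
        return min(amp_max, amplitude + ramp_step)
    return max(amp_min, amplitude - ramp_step)


def run_controller(beta_trace, **params):
    """Apply the step rule to a sequence of beta band-power estimates."""
    amplitude = params["amp_min"]
    history = []
    for beta_power in beta_trace:
        amplitude = adaptive_stim_step(beta_power, amplitude, **params)
        history.append(amplitude)
    return history


# Illustrative trace: rest, a movement epoch with suppressed beta, then rest again.
beta_trace = [1.0, 1.0, 0.2, 0.2, 0.2, 0.2, 0.2, 1.0, 1.0]
amplitudes = run_controller(
    beta_trace, threshold=0.5, amp_min=0.0, amp_max=2.0, ramp_step=0.5
)
print(amplitudes)
```

Ramping rather than switching abruptly mirrors the "ramped back down" behavior in the text; a practical controller would also need hysteresis or dwell times to avoid chattering around the threshold.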

Experience with preliminary neural coprocessor algorithms motivates the requirements for the design of next-generation investigational tools, which aim to map integrated circuits into *de novo* neural networks to help restore function. Requirements include embedding physiological sensing instrumentation for better characterizing neural networks, including sub-cortical structures like the basal ganglia [4] as well as the cortex, along with mapping transfer functions between stimulation and biomarkers correlated with clinical outcomes that are required for adaptive therapies [2]. To help gather data, the system must be able to record and telemeter brain signals during activities of daily living, with high throughput and at least arms-length distance. While power consumption is an additional constraint, this requirement is partially mitigated by the use of a rechargeable system; rechargeable implanted devices can support the required sense and transmission functionality while maintaining desired implant longevity.

Figure 29.1.2 captures the key system specifications for the system we designed to support the NIH BRAIN and SPARC initiatives, embodying the key needs for a next-generation neural coprocessing system suitable for clinical feasibility studies. The mapping of these requirements to an integrated circuit system is illustrated in Fig. 29.1.3. The functionality is partitioned into circuit blocks that include the stimulation engine, physiological sensing amplifiers, algorithm processing for recording local passive and evoked field potentials, data loop recording, mechanical lead systems for multi-lead brain network exploration, and distance telemetry.

The foundational circuit design is the stimulation engine, which must maintain core capabilities for established therapies to ensure the expected patient benefit. To facilitate this, the investigational design builds on the commercial implantable system for spinal cord stimulation (Intellis™), but enhances the stimulation capabilities by providing the ability to change stimulation patterns. Layered on this stimulation capability is physiology sensing, including recording and processing signals in the presence of stimulation. Sensing during stimulation for sub-cortical neural circuits requires the ability to resolve 100nV/$\sqrt{\text{Hz}}$ field potential signals in the presence of >1V, ~100µs pulse trains, with stimulation and sensing frequencies that are separated by at least 10Hz. Figure 29.1.4 illustrates the key enhancements to the sensing signal chain that help provide this capability.
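To get a feel for the size of this sensing challenge, the following back-of-the-envelope sketch compares a 1V stimulation pulse with the noise integrated over the LFP band. It assumes a flat 100nV/√Hz density across roughly 0.5 to 400Hz (the LFP band of interest quoted for this system) and a 1V artifact amplitude; these are simplifying assumptions for illustration, not measured figures.

```python
import math

NOISE_DENSITY_V_PER_RTHZ = 100e-9   # field-potential density to be resolved
LFP_BAND_HZ = (0.5, 400.0)          # approximate LFP band of interest
STIM_AMPLITUDE_V = 1.0              # lower bound on the stimulation pulse amplitude


def integrated_noise_vrms(density_v_per_rthz, f_lo_hz, f_hi_hz):
    """Integrate a flat (white) spectral density over a band to get an rms value."""
    return density_v_per_rthz * math.sqrt(f_hi_hz - f_lo_hz)


def ratio_db(large_v, small_v):
    """Express a voltage ratio in decibels."""
    return 20 * math.log10(large_v / small_v)


noise_vrms = integrated_noise_vrms(NOISE_DENSITY_V_PER_RTHZ, *LFP_BAND_HZ)
artifact_to_signal_db = ratio_db(STIM_AMPLITUDE_V, noise_vrms)

print(f"in-band level: {noise_vrms * 1e6:.2f} uVrms")
print(f"stimulation-to-signal ratio: {artifact_to_signal_db:.0f} dB")
```

Under these assumptions the stimulation artifact is on the order of 114dB larger than the in-band signal level, which suggests why blanking, filtering, and differential techniques are combined rather than relying on converter dynamic range alone.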

There are four key blocks that are relevant to this capability: a synchronous front-end blanking switch, filtering to attenuate the large common-mode and differential-mode stimulation artifacts that tend to occur outside of the local field potential (LFP) bands of interest (approximately 0.5 to 400Hz), a fully differential amplifier design to further improve common-mode rejection of stimulation artifacts, and design techniques that remove the higher-order harmonics of stimulation in the digitization stage to allow sampling at lower frequencies and avoid harmonic folding into the bands of interest. The sampled output of the sense chain may be used for assessment of a biomarker in the time domain, or a frequency analysis of the signal may be performed with a customized Fast Fourier transform (FFT) hardware subroutine that processes the signal, filters it, and sums the power-in-band.
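The power-in-band computation mentioned above can be illustrated with a direct discrete Fourier transform. This is a plain software sketch of the idea, not the customized FFT hardware routine: it omits windowing and the filtering stage, and the 1kHz sampling rate, 200-sample block, and test tones are illustrative choices.

```python
import cmath
import math


def band_power(samples, fs_hz, f_lo_hz, f_hi_hz):
    """Sum the mean-square power of the DFT bins that fall inside a band."""
    n = len(samples)
    total = 0.0
    for k in range(1, n // 2):                 # skip DC and the Nyquist bin
        freq_hz = k * fs_hz / n
        if f_lo_hz <= freq_hz <= f_hi_hz:
            x_k = sum(
                s * cmath.exp(-2j * math.pi * k * i / n)
                for i, s in enumerate(samples)
            )
            total += 2 * abs(x_k) ** 2 / n ** 2   # one-sided power of this bin
    return total


FS_HZ = 1000.0
N = 200                                        # 5Hz bin spacing

# Test signal: a 20Hz tone (amplitude 1.0) plus a 100Hz tone (amplitude 0.5).
signal = [
    math.sin(2 * math.pi * 20 * i / FS_HZ) + 0.5 * math.sin(2 * math.pi * 100 * i / FS_HZ)
    for i in range(N)
]

beta_band_power = band_power(signal, FS_HZ, 15.0, 25.0)
high_band_power = band_power(signal, FS_HZ, 90.0, 110.0)
print(beta_band_power, high_band_power)
```

For a bin-centered sinusoid of amplitude A, the summed power equals A²/2, so the two bands return approximately 0.5 and 0.125 for this test signal.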

The processing of signals to close the loop can be achieved through multiple approaches. Signals may be streamed to an external computer for additional analysis of biomarkers using a computer in the loop. The computer may use this information to provide meaningful feedback to the system in the form of changes to stimulation parameters. Communication latency (<100ms round-trip) was optimized using the existing clinical system to allow external algorithms to be assessed prior to enabling them chronically in the embedded system. Once an external algorithm has been evaluated, the embedded firmware can be upgraded through telemetry with successful new algorithms to provide patient benefit chronically. Additional support for the power and telemetry functionality is enabled through a rechargeable system that minimizes patient burden by fully recharging devices in <1 hour, and by using a MICS-band radio that can stream the four sensing channels at 1kHz sampling rate continuously [7].

As design and development of neural coprocessors occurs, rigorous testing is required to obtain regulatory approval. This begins with testing of the integrated circuits and the full chip stack shown in Fig. 29.1.3, and continues with system evaluation of electrical performance against ISO standards to show safety of the system to environmental aggressors such as external defibrillation events, electro-cautery use, and exposure to theft detection systems. The final system includes integration with leads and interconnections to complete the prosthetic linkages (Fig. 29.1.5). Using the prototype system, animal testing was conducted in an ovine model [5] as well as in a canine epilepsy model to generate a combined 15 years of LFP recording data during the development process (Fig. 29.1.6). Executing the *in vivo* protocols serves as a key validation step to ensure the capability of the system within the intended use cases like epilepsy seizure prediction research [7].

This work has introduced a system architecture for building a brain coprocessor designed for exploring novel methods for treating neurological disorders. The investigational device is currently in deployment for the BRAIN initiative exploring epilepsy, Parkinson's disease, essential tremor, and depression. Insights gained from these studies will help guide the design of future medical devices.

### References:

- [1] E. Boyden, "Brain Coprocessors: The Need for Operating Systems to Help Brains and Machines Work Together," *Intelligent Machines*, Sept. 22, 2010.
- [2] J. Herron, et al., "Cortical Brain Computer Interface for Closed-Loop Deep Brain Stimulation," *IEEE Trans. on Neural Systems and Rehabilitation Engineering*, vol. 25, no. 11, pp. 2180-2187, 2017.
- [3] P. Khanna, et al., "Neurofeedback Control in Parkinsonian Patients Using Electrocorticography Signals Accessed Wirelessly with a Chronic, Fully Implanted Device," *IEEE Trans. on Neural Systems and Rehabilitation Engineering*, vol. 25, no. 10, pp. 1715-1724, 2017.
- [4] M. Rivlin-Etzion, et al., "Basal Ganglia Oscillations and Pathophysiology of Movement Disorders," *Current Opinion in Neurobiology*, vol. 16, pp. 629-637, 2006.
- [5] P. Stypulkowski, et al., "Modulation of Hippocampal Activity with Fornix Deep Brain Stimulation," *Brain Stimulation*, pp. 1-8, 2017.
- [6] D. Bourget, et al., "An Implantable, Rechargeable Neuromodulation Research Tool Using a Distributed Interface and Algorithm Architecture," *Neural Engineering Conference*, 2015.
- [7] V. Kremen, et al., "Continuous Active Probing and Modulation of Neural Networks with a Wireless Implantable System," *Proc. IEEE BioCAS*, 2017 (in press).
- [8] E. Opri, et al., "Towards Adaptive Cortico-Thalamic Closed-Loop Deep Brain Stimulation for the Treatment of Essential Tremor," *Neuroscience Annual Meeting*, 389.17/Y17, 2017.



**Figure 29.1.1:** Fully embedded cortical sensing of movement-related beta desynchronization being utilized to control sub-cortical thalamic stimulation in an essential tremor subject. Note that as the patient moves (highlighted in green), the low-frequency signals desynchronize around 20Hz, and the stimulation amplitude increases. Courtesy of Opri, Okun, Foote, and Gunduz, University of Florida [8].

| LFP/ECOG Sensing                            |                                                                    | Embedded Algorithm Characteristics   |                                                   |
|---------------------------------------------|--------------------------------------------------------------------|--------------------------------------|---------------------------------------------------|
| Operating Power Dissipation (Time Domain)   | 5W/channel                                                         | Algorithm Power                      | <5W/channel (embedded)                            |
| Operating Power Dissipation (Spectral Mode) | 500mW/channel                                                      | Algorithm Type (Embedded)            | Support Vector Machines, State Machines, etc      |
| Typical Function modes                      | Time domain/Fourier Transforms in DSP                              | Algorithm Upgrade Capability         | In-vivo through telemetry and embedded bootloader |
| MUX, channels available                     | Input mux allows 16 → 4 down-selection of best channels for upload |                                      | <b>Memory Buffer (Monitoring Diagnostics)</b>     |
| PC Dual Lead Implant System Assumed         |                                                                    |                                      |                                                   |
| Minimal Detectable Signal                   | <200nVrms                                                          | SRAM                                 | 250kb                                             |
| Spot Noise Spectral Density                 | <150 nV/ $\sqrt{\text{Hz}}$                                        | <b>Stimulation Capability</b>        |                                                   |
| Bandpower Center Frequency                  | dc to 500Hz                                                        | Stimulation Channels                 | 8 for bilateral (4/lead) (unipolar/bipolar)       |
| Bandwidth of Spectral Estimate              | 1-20Hz (FFT determined)                                            | <b>Inertial Sensor</b>               |                                                   |
| CMRR/P/SRR                                  | >80dB                                                              | Operating Power (3-axis Measurement) | 2W                                                |
| High Pass Corners                           | 0.05-8Hz                                                           | Inertial Algorithm Power Dissipation | 25W (posture, activity, tremors, etc)             |
| Input Range (Stim compliance)               | >+/-10V                                                            | Sensitivity                          | 125mV/g (0.1gLSB)                                 |
| <b>Telemetry and Recharge Intervals</b>     |                                                                    | Dynamic Range                        | +/-5g (Falls, footsteps, high impact activity)    |
| Data Rate                                   | 195kbits/s                                                         | Noise (X,Y axis)                     | 3.5 mgRMS (0.1-10Hz)                              |
| Data Capacity                               | 4 channels preprocessed                                            | Noise (Z axis)                       | 5 mgRMS (0.1-10Hz)                                |
| Data Streaming capability                   | 4 Time Domain at 250 or 500Hz, 2 Time Domain at 1kHz               | Nonlinearity and Sensing Floor       | <1%, 10mg any axis                                |
| Recharge Interval (100%)                    | >24hrs                                                             | Shock Survival                       | >10,000g                                          |

**Figure 29.1.2:** Neural coprocessor data sheet [6].



**Figure 29.1.3:** Neural coprocessor integrated circuit die stack. Each modular function uses an optimal circuit technology; for example, 0.25μm HV for sensing and stimulation, and 90nm for micro processing and digital signal processing.



**Figure 29.1.4:** Architectural design of a full-duplex sensing-and-stimulating neural system.



**Figure 29.1.5:** Integration of the integrated circuit stack into a final implantable pulse generator, which is then fully implanted with lead systems that interface with the nervous system to form the final neural coprocessing unit.



**Figure 29.1.6:** Illustration of brain coprocessor events from chronic brain recordings *in vivo*; events are collected using the on-chip detection algorithms to trigger the detector when seizure-like activity is seen. Courtesy of Dr. Gregory Worrell at Mayo Clinic [7].

## 29.2 A Fully Immersible Deep-Brain Neural Probe with Modular Architecture and a Delta-Sigma ADC Integrated Under Each Electrode for Parallel Readout of 144 Recording Sites

Daniel De Dorigo<sup>1</sup>, Christian Moranz<sup>1</sup>, Hagen Graf<sup>1</sup>, Maximilian Marx<sup>1</sup>, Boyu Shui<sup>1</sup>, Matthias Kuhl<sup>1</sup>, Yiannos Manoli<sup>1,2</sup>

<sup>1</sup>University of Freiburg - IMTEK, Freiburg, Germany

<sup>2</sup>Hahn-Schickard, Villingen-Schwenningen, Germany

The evolution of tissue-penetrating probes for high-density deep-brain recording of *in vivo* neural activity is limited by the level of electronic integration on the probe shaft. As the number of electrodes increases, conventional devices need either a large number of interconnects at the base of the probe or allow only a reduced number of electrodes to be read out simultaneously [1,2]. Active probes are used to improve the signal quality and reduce parasitic effects *in situ*, but still need to route these signals from the electrodes to a base where the readout electronics is located on a large area [3,4]. In this work, we present a modular and scalable architecture of a needle probe, which, instead of routing or prebuffering noise-sensitive analog signals along the shaft, integrates analog-to-digital conversion under each electrode in an area of  $70 \times 70 \mu\text{m}^2$ . The design eliminates the need for any additional readout circuitry at the top of the probe and connects with a digital 4-wire interface. The presented reconfigurable 11.5mm probe features a constant width of  $70 \mu\text{m}$  and thickness of  $50 \mu\text{m}$  from top to bottom for minimal tissue damage with 144 integrated recording sites and can be fully immersed in tissue for deep-brain recording applications.

The probe in Fig. 29.2.1 consists of a short base, a chain of modular recording sites, and a tip. The whole probe is separated along its length into a digital and an analog part with separate supply routing and a low-impedance ground shield in between, which also covers the top to increase the robustness against EMI and to reduce digital noise coupling. The base includes a reference transistor providing the global voltage biasing $V_{\text{BIAS}}$ to all recording sites and an FSM that forwards the internal data and configuration chains to the external unit. In the analog part, there are only two global reference lines throughout the probe, i.e., the body reference voltage $V_{\text{BODY}}$ and $V_{\text{BIAS}}$. The bias voltage (referenced to $V_{\text{DD}}$) is routed with large parasitic capacitances to the supply to enhance noise rejection from external sources. On the digital side, no global signals are routed: the chain signal as well as the clock are forwarded from one block to the next. The clock is slightly delayed from site to site to spread digital supply noise and reduce peak current consumption. The recording sites are grouped into blocks of two ADCs, one connected to the forward and one to the backward chain and clock.

Since even the largest neuronal signals are only in the range of some tens of millivolts and the required linearity is low, a direct conversion using a gm-C based incremental  $\Delta\Sigma$ -ADC under each electrode is implemented. The first-order modulator shown in Fig. 29.2.2 allows the implementation on a minimal silicon area, since only one integrator and capacitor and no accurate time constants, thus no local biasing, are needed. Decimation is accomplished using a simple ripple counter. The output of the single branch OTA-C integrator is connected to the quantizer, i.e., comparator and output latch, driving the switches for the current feedback.

The 11b ADCs are designed to optimize noise performance per area; therefore, as much area as possible is dedicated to the noise-critical components, i.e., the input ($171 \mu\text{m}^2$) and the load transistors ($144 \mu\text{m}^2$). Only a small area is dedicated to the feedback current sinks, which are derived from the global bias line. The current determines the full scale (FS) of the ADC, which can be configured to $\pm 11.25\text{mV}$, $\pm 22.5\text{mV}$ or $\pm 45\text{mV}$. Depending on the comparator output, the current is injected into either the left or the right low-impedance cascode node of the OTA. Common-mode ripple caused by the asymmetrical feedback is reduced by connecting the $95\text{fF}$ MIM integration capacitances ($7 \times 3.5 \mu\text{m}^2$) to $V_{\text{CMFB}}$ and rejected by the differential comparator input. The noise of the feedback current source and the feedback switches, which operate at digital-level input signals, is negligible compared to the major noise contributors. The constraints on the area and accuracy of the source-degenerated DDA-CMFB are not stringent, first because noise is canceled by the differential nature of the circuit, and second because no exact common mode is needed at the comparator input. The transconductance of the differential pair is determined by thermal noise considerations to be $4.2 \mu\text{S}$. A measured maximal SNR of $65.6\text{dB}$ ($\text{FS} = \pm 45\text{mV}$) and a THD of $0.22\%$ at $V_{\text{PP}} = 10\text{mV}$ ($\text{FS} = \pm 11.25\text{mV}$) are obtained for a tail current of $1.5\mu\text{A}$.

The digital part of the ADC consists of a decimator, i.e., a ripple counter, two registers for the 11b conversion result and a 2b configuration register. The ADC runs for 1024 cycles, delivering a 10b result. Before resetting the OTA output and the counter, the last result of the comparator, which represents the final conversion error, is appended as the eleventh bit and put on the data chain. The delayed clock of a following cell is used for the latch to avoid timing violations between the latch and the comparator. During readout, the digital data is shifted through the recording sites. The presented device uses two separate chains, each with a bit rate of $20.48\text{Mb/s}$, i.e., $f_s = 20.48\text{MHz}$. The FSM in the base combines the outputs of both chains into a single data stream by time multiplexing, which yields $40.96\text{Mb/s}$ at the front-end. The base consumes $37\mu\text{W}$ and the power consumption per recording site amounts to $39.14\mu\text{W}$, of which $12.77\mu\text{W}$ is consumed by the analog part.
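The conversion described above can be illustrated with a small behavioral model. The sketch below is not the authors' circuit: it assumes an ideal integrator, a comparator threshold at zero, and a 1b feedback of exactly ±FS. The function names and the reading of the eleventh bit as a half-step refinement of the counter value are assumptions made for illustration; counter overflow at the positive full-scale limit is not modeled.

```python
def incremental_delta_sigma(u, fs, n_cycles=1024):
    """Behavioral first-order incremental delta-sigma conversion.

    u  : input voltage, with |u| <= fs
    fs : full-scale voltage (e.g. 11.25e-3 for the +/-11.25 mV mode)
    Returns (count, residue_bit).
    """
    if abs(u) > fs:
        raise ValueError("input exceeds full scale")
    integ = 0.0  # integrator state, reset before every conversion
    count = 0    # ripple-counter value (the decimator)
    for _ in range(n_cycles):
        bit = 1 if integ >= 0.0 else 0   # comparator and output latch
        count += bit                     # counter advances on every '1'
        feedback = fs if bit else -fs    # 1b feedback, idealized as +/-FS
        integ += u - feedback            # ideal integration step
    # Final comparator decision: sign of the remaining conversion error.
    residue_bit = 1 if integ >= 0.0 else 0
    return count, residue_bit


def reconstruct(count, residue_bit, fs, n_cycles=1024):
    """Map (count, residue_bit) back to a voltage estimate (assumed decoding)."""
    coarse = fs * (2.0 * count / n_cycles - 1.0)
    fine = fs / n_cycles if residue_bit else -fs / n_cycles
    return coarse + fine


if __name__ == "__main__":
    fs = 11.25e-3
    for u in (-5e-3, 0.0, 3.3e-3):
        c, r = incremental_delta_sigma(u, fs)
        print(u, c, r, reconstruct(c, r, fs))
```

In this idealized model the integrator state stays within ±2·FS, so the counter alone bounds the error by 2·FS/1024, and using the final comparator decision halves that bound to FS/1024.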

The ADC covers an input signal bandwidth of $10\text{kHz}$ with a flicker noise corner between $240$ and $590\text{Hz}$, depending on the FS mode. The noise levels in the frequency bands of the two types of neural signals, i.e., local field potentials (LFP, 1 to $300\text{Hz}$) and action potentials (AP, 0.3 to $10\text{kHz}$), are $8.1\mu\text{V}_{\text{rms}}$ and $13.4\mu\text{V}_{\text{rms}}$, respectively ($\text{FS} = \pm 11.25\text{mV}$), and are shown in detail in Fig. 29.2.3. All measurements are taken *in vitro*, i.e., they also include noise resulting from the electrodes and the electrolyte surface interface, and are taken without any additional shielding.

Figure 29.2.4 shows the *in vitro* setup. The averaged data of all ADCs is used to drive a proper body voltage for the solution and for cancellation of artifact signals. The measured signal is separated into LFP and AP by digital post-processing. The shielding and layout concept with both input transistors of the differential pair placed under the electrode suppresses illumination artifacts, which is a strong requirement for optogenetic stimulation of neuronal cells [5]. Sensitivity measurements against pulsing broadband light sources are shown in Fig. 29.2.5. The resulting signal shifts during light excitation are consistent with photonic effects on the Pt-electrode surface while the CMOS circuit underneath does not further degrade the performance.

Figure 29.2.6 compares the system to the current state of the art. The micrographs in Fig. 29.2.7 show the fully implantable probe with a constant width of $70 \mu\text{m}$ and thickness of $50 \mu\text{m}$ from tip to base. The length of the base is independent of the number of electrodes. The maximal number of recording sites is limited solely by the data rate of the chain, as the clock frequency equals $f_s$ of the ADCs. Each ADC delivers $20\text{kS/s}$, limiting the length to 93 electrodes per chain. The presented probe employs two data chains; however, an extension with multiple chains would add only marginal complexity to the digital part. Since no global analog neural-signal routing is present and due to the high modularity of the design, a longer probe or any application-specific modification of the probe geometry would deliver identical performance. Technology scaling would considerably reduce the power dissipation as well as the probe width, since half the probe area is dedicated to the digital circuitry.
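The chain-length limit follows from the numbers quoted in the text. The short calculation below reproduces it, assuming that each site places exactly its 11b result on the chain once per conversion and that the 144 sites are split evenly over the two chains; the variable names are illustrative only.

```python
# Figures taken from the text; the framing assumptions are stated above.
F_CLK_HZ = 20_480_000          # chain clock, equal to the ADC clock f_s
CYCLES_PER_CONVERSION = 1024   # one incremental conversion
BITS_PER_RESULT = 11           # 10b counter value plus the appended bit
N_CHAINS = 2
N_SITES = 144

# Output rate of a single ADC in samples per second.
sample_rate_hz = F_CLK_HZ // CYCLES_PER_CONVERSION

# Bits each recording site contributes to its chain per second.
bits_per_site_per_s = sample_rate_hz * BITS_PER_RESULT

# Largest whole number of sites one chain can carry at the chain bit rate.
max_sites_per_chain = F_CLK_HZ // bits_per_site_per_s

# Time-multiplexed stream leaving the base.
front_end_rate_bps = N_CHAINS * F_CLK_HZ

# Load of the presented probe, assuming an even split over both chains.
sites_per_chain = N_SITES // N_CHAINS

print(sample_rate_hz, max_sites_per_chain, front_end_rate_bps, sites_per_chain)
```

Under these assumptions the presented probe uses 72 of the 93 available slots per chain.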

### Acknowledgments:

This research was supported by the *Fritz Huettinger Foundation* as well as by the *Cluster of Excellence BrainLinks-BrainTools* funded by the German Research Foundation (DFG, grant number EXC1086). The authors thank A. Sayed Herbawi, Dr. P. Ruther and Prof. Oliver Paul for the post-CMOS fabrication carried out in collaboration with the Microsystem Materials Laboratory at IMTEK, University of Freiburg, Germany.

### References:

- [1] J. Scholvin, et al., "Close-Packed Silicon Microelectrodes for Scalable Spatially Oversampled Neural Recording," *IEEE Trans. Biomed. Eng.*, pp. 120-130, 2016.
- [2] A. S. Herbawi, et al., "High-Density CMOS Neural Probe Implementing a Hierarchical Addressing Scheme for 1600 Recording Sites and 32 Output Channels," *IEEE Transducers*, pp. 20-23, 2017.
- [3] B.C. Raducanu, et al., "Time Multiplexed Active Neural Probe with 678 Parallel Recording Sites," *IEEE ESSDERC*, pp. 385-388, 2016.
- [4] C.M. Lopez, et al., "A 966-Electrode Neural Probe with 384 Configurable Channels in  $0.13\mu\text{m}$  SOI CMOS," *ISSCC Dig. Tech. Papers*, pp. 392-393, 2016.
- [5] T.D.Y. Kozai and A.L. Vazquez, "Photoelectric Artefact from Optogenetics and Imaging on Microelectrodes and Bioelectronics: New Challenges and Opportunities," *J. Mater. Chem. B*, pp. 4965-4978, 2015.



Figure 29.2.1: System-level schematic with 3D view of the neural probe.

Figure 29.2.2: Incremental $\Delta\Sigma$ system-level schematic and transistor-level implementation (the feedback current sources of only one full-scale mode are shown: $I_{FBN}$, $I_{FBP}$).

Figure 29.2.3: Measured DNL/INL for $FS=\pm 45\text{mV}$ as well as *in vitro* power spectral density plot and statistical noise distribution (384 recording sites - multiple probes).

Figure 29.2.4: *In vitro* measurement setup showing DC controller / artifact filter and digital post-processing. Measurement results from stimulation with prerecorded data (hippocampus). Photo shows *in vitro* MEA adapter with two needle probes for brain slice activity recording.

Figure 29.2.5: Photometric and radiometric light sensitivity measurement (average noise of all illuminated recording sites) for optogenetic applications. For comparison: an illuminance of ~500 lux corresponds to typical office lighting, ~10000 lux to full daylight.



|                                       | [1]         | [2]      | [3]        | [4]           | This Work (FS = ±11.25mV / ±22.5mV / ±45mV)       |
|---------------------------------------|-------------|----------|------------|---------------|---------------------------------------------------|
| Type                                  | Passive     | Passive  | Active     | Active        | Active                                            |
| # of electrodes                       | 5x200       | 1600     | 1356       | 384/966       | 144                                               |
| # of channels (simultaneous readout)  | 1000        | 32       | 768/1356** | 384           | 144 (all sites)                                   |
| Electrode Area ( $\mu\text{m}^2$ )    | 9 x 9       | 17 x 17  | 20 x 20    | 12 x 12       | 706.8 (r = 15 $\mu\text{m}$ )                     |
| Electrode Pitch ( $\mu\text{m}$ )     | 11          | 20       | 22.5       | 20            | 70                                                |
| Shank size ( $\text{mm}^2$ )          | 0.075 x 7.5 | 0.1 x 10 | 0.1 x 8    | 0.07 x 10     | 0.07 x 10.5                                       |
| Base area ( $\text{mm}^2$ )           | 66.75*      | 0.9      | 160.65     | 45.23         |                                                   |
| Shank Total Area Ratio                | 0.84%       | 52.63%   | 0.50%      | 1.52%         | 91.3%                                             |
| Base Area / Channel ( $\text{mm}^2$ ) | 0.067       | 0.028    | 0.12       | 0.12          | 0.00049                                           |
| ADC Resolution (bit)                  | -           | -        | 10         | 10            | 11                                                |
| Architecture                          | -           | -        | SAR        | SAR           | OTA-C Inc. $\Delta\Sigma$                         |
| Noise ( $\mu\text{V}_{\text{RMS}}$ )  |             |          |            |               |                                                   |
| LFP: 0.5Hz-1kHz                       | -           | -        | 50.2       | 10.32         | 9.95 / 10.35 / 11.39                              |
| LFP: 1Hz-300Hz                        | -           | -        | -          | -             | 8.11 / 8.19 / 8.60                                |
| AP: 300Hz-10kHz                       | -           | -        | 12.4       | 6.36          | 13.43 / 15.25 / 19.91                             |
| Power/Channel ( $\mu\text{W}$ )       | -           | -        | 45         | 49.06         | 39.14 / 43.32 / 46.29                             |
| THD                                   | -           | -        | n/a        | 0.4% @ 10mVpp | 0.22% @ 10mVpp / 0.48% @ 20mVpp / 0.75% @ 40mVpp  |
| Crosstalk @ 1kHz (dB)                 | -           | -        | -63        | -64.4         | -74.7                                             |
| Supply Voltage (V)                    | -           | 1.8      | 1.2        | 1.2/1.8       | 1.8                                               |
| Technology                            | SOI/EBL     | CMOS     | CMOS       | CMOS          | 180nm CMOS                                        |

\*estimated, \*\*intermittent/higher noise, \*\*\*electronic part (w/o pads)

Figure 29.2.6: Comparison with state-of-the-art designs.



## 29.3 A 16384-Electrode 1024-Channel Multimodal CMOS MEA for High-Throughput Intracellular Action Potential Measurements and Impedance Spectroscopy in Drug-Screening Applications

Carolina Mora Lopez<sup>1</sup>, Ho Sung Chun<sup>1</sup>, Laurent Berti<sup>2</sup>, Shiwei Wang<sup>1</sup>, Jan Putzeys<sup>1</sup>, Carl Van Den Bulcke<sup>1,3</sup>, Jan-Willem Weijers<sup>1</sup>, Andrea Firrincieli<sup>1</sup>, Veerle Reumers<sup>1</sup>, Dries Braeken<sup>1</sup>, Nick Van Helleputte<sup>1</sup>

<sup>1</sup>imec, Heverlee, Belgium

<sup>2</sup>Chrysalite, Tervuren, Belgium

<sup>3</sup>KU Leuven, Leuven, Belgium

Patch clamp is currently the gold standard for studying cell electrophysiology in preclinical drug discovery. Since it is a time-consuming technique, passive and active multielectrode arrays (MEAs) have been introduced for increased throughput in extracellular (ExC) *in vitro* measurements [1-3]. However, most of these tools cannot capture the essential features of intracellular (InC) action potentials (APs) used for studying drug toxicity. It has been shown that InC access can be achieved by highly localized electroporation [4], which allows low-impedance electrical recording of the InC voltage. While this technique was explored in recent designs [4-5], a tool that enables high-throughput InC measurements is not yet available. Impedance measurement is also used to study cell morphology, adhesion, differentiation and contractility [6]. Recent designs [1-3] already include impedance spectroscopy (IS), but their methods are not fast enough to capture in detail the contractile activity of cardiomyocytes. Since some drugs can inhibit cell beating without affecting the APs [6], ExC/InC recording and impedance measurement are complementary.

We present a CMOS MEA with 16384 electrodes and 1024 channels designed for multimodal cell electrophysiology (Fig. 29.3.1). The different cell-interfacing modalities are: 1) ExC recording (in 1024 simultaneous sites); 2) InC recording (1024 sites); 3) constant voltage stimulation (CVS) for controlled electrode potential (4096 sites); 4) constant current stimulation (CCS) for controlled charge delivery (64 sites); 5) fast impedance monitoring (IM) at a fixed frequency (1024 sites); and 6) IS in a frequency range of 10Hz to 1MHz (64 sites). CVS and CCS can be used for 2 purposes: influence cell behavior or induce cell electroporation before InC recording. While IM can detect impedance changes over time (e.g., cell contractility), single-cell IS gives detailed information of the electrode impedance, seal resistance and cell-membrane impedance (e.g., cell differentiation). By enabling multi-well assays in a single chip, the 6 modalities can be used independently and simultaneously in 16 different cell assays, increasing considerably the throughput of current single-well CMOS MEAs [1-5].

Figure 29.3.2 shows the architecture of the CMOS MEA. The active sensor area consists of an array of 16384 electrodes connected to 4096 pixels. Each pixel has an AC-coupled source follower, switches for electrode selection and connection to the stimulation/impedance circuits, and a 4b local memory. The pixels connect to 1024 recording channels and 64 stimulation units (SUs). Each channel has 4 independent modes (ExC/InC recording, IM and IS) all using the same reconfigurable instrumentation amplifier (IA). The channel gain/bandwidth can be programmed independently in each mode. The channel outputs are time-multiplexed and digitized with 64 10b SAR ADCs, and the data is transmitted through a serial interface (168Mb/s).

Each independently controlled SU has a 5b voltage DAC (VDAC) and an 8b current DAC (IDAC), working at 100kS/s. The VDAC is a binary R-2R resistor ladder, buffered by a class-AB amplifier and able to drive the load of 64 parallel electrodes during CVS. The IDAC has two purposes: to deliver arbitrary current waveforms during CCS or to provide a DC current during IS. Since the impedance to be measured during IS can range from tens of GΩ to hundreds of kΩ (5 orders of magnitude), the IDAC must be able to generate currents from a few pA to hundreds of nA. To achieve this, the IDAC is designed using a binary current-steering architecture with independent source and sink outputs (Fig. 29.3.3). Two possible LSB current references (2pA or 500pA) are generated by combining a current splitter based on an R-2R principle and a current aggregator. Dual regulated cascodes (for the 2 ranges) are used to ensure a high output impedance (>100GΩ). A digitally assisted closed-loop charge-balancing (CB) circuit in each SU prevents residual charge injection into the electrode-cell interface. For this, the positive or negative pulses are extended based on the crossing of predefined electrode voltage thresholds (Fig. 29.3.4 right).
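The closed-loop charge-balancing idea (extending one pulse until the electrode voltage crosses a threshold) can be sketched with a toy model. The example below assumes a purely capacitive electrode and a cathodic-first biphasic pulse; the function names and all numeric values (currents, capacitance, threshold, time step) are hypothetical and chosen only so that the 12.5% mismatch of Fig. 29.3.4 leaves a visible residual. It is not a model of the actual circuit.

```python
def residual_after_biphasic(i_cathodic, i_anodic, t_phase, c_electrode):
    """Electrode voltage left after an open-loop biphasic pulse.

    The electrode is modeled as an ideal capacitor (an assumption).
    """
    v = -i_cathodic * t_phase / c_electrode  # cathodic phase removes charge
    v += i_anodic * t_phase / c_electrode    # anodic phase restores charge
    return v


def extend_until_balanced(v_residual, i_cathodic, i_anodic, c_electrode,
                          v_threshold, dt, max_steps=100_000):
    """Extend the anodic pulse (v too low) or the cathodic pulse (v too high)
    in steps of dt until the voltage lies within +/- v_threshold.

    Returns (final_voltage, extension_time).
    """
    v = v_residual
    steps = 0
    while abs(v) > v_threshold:
        if steps >= max_steps:
            raise RuntimeError("threshold window not reached")
        if v < 0.0:
            v += i_anodic * dt / c_electrode     # keep sourcing current
        else:
            v -= i_cathodic * dt / c_electrode   # keep sinking current
        steps += 1
    return v, steps * dt


if __name__ == "__main__":
    # Hypothetical values: 100 nA cathodic, 12.5% weaker anodic current.
    I_CATHODIC, I_ANODIC = 100e-9, 87.5e-9
    T_PHASE, C_ELECTRODE = 1e-3, 1e-9
    V_THRESHOLD, DT = 1e-3, 10e-6

    v_open = residual_after_biphasic(I_CATHODIC, I_ANODIC, T_PHASE, C_ELECTRODE)
    v_cb, t_ext = extend_until_balanced(v_open, I_CATHODIC, I_ANODIC,
                                        C_ELECTRODE, V_THRESHOLD, DT)
    print(v_open, v_cb, t_ext)
```

With these toy numbers the open-loop residual is about -12.5mV, and the extension brings it back inside the ±1mV window. The loop relies on each voltage step being smaller than the window; otherwise the guard on `max_steps` stops it.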

In IS mode, the SUs generate currents of varied frequency and amplitude. Unlike other designs [1-3], the IS circuit uses square-wave current modulation (chopper after the IDAC) instead of sine waves, achieving lower power and area. The IA output demodulator is synchronized to the input current modulator to convert the impedance signal to baseband, while the ExC/InC signals are upmodulated and filtered out. The phase of the output modulator is programmable, which allows the real and imaginary parts of the cell impedance to be extracted in an interleaved manner. Since IS is a slow method, IM at a fixed frequency (1 or 10kHz) is included in each channel to provide fast monitoring of impedance changes with a resolution of 0.1 or 1ms. In this mode, a square-wave current with passive CB is applied to the electrodes, bypassing the pixel buffer, and the upmodulated impedance signal is measured as in ExC-recording mode. Figure 29.3.4 (left) shows the channel operation in the different modes.
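The role of the programmable demodulator phase can be illustrated numerically. The sketch below drives a series R-C load with a square-wave current and multiplies the resulting voltage by square-wave references at 0° and 90°. The series R-C load, the function name, and all component values are hypothetical stand-ins for the electrode-cell interface, and the IA gain and filtering are ignored; this is an idealized sketch, not the chip's signal chain.

```python
def demodulate_series_rc(r_ohm, c_farad, i_amp, f_hz,
                         n_periods=4, steps_per_period=1000):
    """Square-wave excitation and synchronous square-wave demodulation.

    Returns (in_phase, quadrature): time averages, in volts, of the load
    voltage multiplied by references at 0 and 90 degrees.
    """
    if steps_per_period % 4:
        raise ValueError("steps_per_period must be divisible by 4")
    dt = 1.0 / (f_hz * steps_per_period)
    half = steps_per_period // 2
    quarter = steps_per_period // 4
    v_cap = 0.0            # capacitor voltage at the start of the step
    acc_i = acc_q = 0.0
    for n in range(n_periods * steps_per_period):
        k = n % steps_per_period
        i_now = i_amp if k < half else -i_amp           # chopped DAC current
        # Load voltage sampled at the middle of the step.
        v_mid = r_ohm * i_now + v_cap + i_now * dt / (2.0 * c_farad)
        ref_0 = 1.0 if k < half else -1.0               # in-phase reference
        ref_90 = 1.0 if (k - quarter) % steps_per_period < half else -1.0
        acc_i += v_mid * ref_0
        acc_q += v_mid * ref_90
        v_cap += i_now * dt / c_farad                   # integrate the current
    total = n_periods * steps_per_period
    return acc_i / total, acc_q / total


if __name__ == "__main__":
    # Hypothetical load: 1 MOhm in series with 1 nF, 1 nA at 1 kHz.
    in_phase, quadrature = demodulate_series_rc(1e6, 1e-9, 1e-9, 1e3)
    print(in_phase, quadrature)
```

In this idealized model the resistor appears only in the in-phase output (as R·I) and the capacitor only in the quadrature output. Because the excitation is a square wave, the capacitive term evaluates to I/(8fC) rather than the sinusoidal I/(2πfC), so a real system would need a known scale factor or calibration to report impedance.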

Figure 29.3.5 shows the measured performance of the various functionalities. Different gains and bandwidths can be configured in the channels depending on the mode. The input-referred noise in ExC recording mode is $7.5 \pm 0.6 \mu\text{V}_{\text{rms}}$ and $12 \pm 2.4 \mu\text{V}_{\text{rms}}$ ($n=1024, 1\sigma$) in the AP (0.3 to 10kHz) and full (0.5Hz to 10kHz) bands, respectively, showing uniform performance across all channels. The voltage and current stimulation waveforms demonstrate the capability of the SUs to generate arbitrary stimuli for CVS and CCS. IS of a 1.2pF capacitor confirms that the chip can accurately measure the impedance magnitude and phase up to 1MHz.

To validate the functionality of the MEA in a biological setting, primary cardiomyocytes were cultured on the sensor area. *In vitro* measurements are shown in Fig. 29.3.6. ExC recordings of the 16 wells are shown as a heatmap of the peak-to-peak AP amplitude (top-left). Fluorescence images (Calcein-AM and Hoechst) of one well are compared to the ExC recording (peak-to-peak amplitude) and IM (1kHz, peak-to-peak contractility impedance) heatmaps, demonstrating that we can identify cell location and attachment (bottom-left). After applying CVS, it was possible to electroporate the cells and record the InC signals (bottom-right). By applying a drug (1µM Nifedipine), changes in the InC signal shape were detected, demonstrating the advantages of this technique over ExC recording. Single-cell IS was also performed to analyze the impedance of the electrode-cell interface over the whole spectrum (top-right).

Figure 29.3.7 shows the chip micrograph and performance comparison with state-of-the-art designs. Based on this comparison, this MEA integrates the highest number of cell-interfacing modalities and is the only one enabling multi-well assays for advanced cell-electrophysiology. The low-noise recording channels have programmable transfer functions with uniform performance across all channels. By achieving the highest number of independent SUs with CB functionality, the chip enables safe simultaneous CVS, CCS and IS at multiple electrodes. Moreover, InC recording and IM are possible in all the channels. In conclusion, this multimodal MEA may open up new opportunities in high-throughput cell-based pharmacological screening as well as fundamental studies of cells, exceeding the capabilities of existing technologies.

### Acknowledgments:

The authors thank S. Mitra for the valuable discussions and guidance, and O. Krylychka and T. Pauwelyn for helping with the cell cultures and *in vitro* tests.

### References:

- [1] T. Chi, et al., "A Multi-Modality CMOS Sensor Array for Cell-Based Assay and Drug Screening," in *IEEE TBioCAS*, vol. 9, no. 6, pp. 801-814, Dec. 2015.
- [2] J. S. Park, et al., "A High-Density CMOS Multi-Modality Joint Sensor/Stimulator Array with 1024 Pixels for Holistic Real-Time Cellular Characterization," *IEEE Symp. VLSI Circuits*, 2 pages, 2016.
- [3] J. Dragas, et al., "In Vitro Multi-Functional Microelectrode Array Featuring 59 760 Electrodes, 2048 Electrophysiology Channels, Stimulation, Impedance Measurement, and Neurotransmitter Detection Channels," *IEEE JSSC*, vol. 52, no. 6, pp. 1576-1590, June 2017.
- [4] Z. C. Lin, et al., "Accurate Nanoelectrode Recording of Human Pluripotent Stem Cell-Derived Cardiomyocytes for Assaying Drugs and Modeling Disease," *Microsystems & Nanoengineering*, 16080, Mar. 2017.
- [5] J. Abbott, et al., "CMOS Nanoelectrode Array for All-Electrical Intracellular Electrophysiological Imaging," *Nature Nanotechnology*, vol. 12, pp. 460-466, Feb. 2017.
- [6] M.F. Peters, et al., "Human Stem Cell-Derived Cardiomyocytes in Cellular Impedance Assays: Bringing Cardiotoxicity Screening to the Front Line," *Cardiovascular Toxicology*, vol. 15, no. 2, pp. 127-139, Apr. 2015.



**Figure 29.3.1:** Concept of the CMOS MEA ASIC in a multi-well assay. A μfluidic interposer isolates the 16 wells in the active sensor area. The top right shows the concept of the InC recording.



**Figure 29.3.2:** High-level system architecture of the multimodal active MEA.



**Figure 29.3.3:** Architecture of the 8b current-steering DAC. Digital-assisted DAC calibration and closed-loop charge balancing functionalities are included.



**Figure 29.3.4:** Spectrum illustration of the 4 operating modes of the channel (left), and concept simulation of the charge-balancing circuit operation in presence of a large current mismatch of 12.5% (right).



**Figure 29.3.5:** Measured electrical performance of the recording (top and right), impedance measurement (bottom left) and stimulation (bottom middle) functionalities.



**Figure 29.3.6:** *In vitro* measurements of primary cardiomyocytes cultured on CMOS MEA.



Figure 29.3.7: Left: die photograph ( $10 \times 19.2\text{mm}^2$ ). Right: performance overview and comparison to prior art.

## 29.4 A 0.13 $\mu$ m CMOS SoC for Simultaneous Multichannel Optogenetics and Electrophysiological Brain Recording

Gabriel Gagnon-Turcotte, Christian Ethier, Yves De Koninck,  
Benoit Gosselin

Laval University, Quebec City, Canada

Optogenetics and multi-unit electrophysiological recording are state-of-the-art approaches in neuroscience to observe neural microcircuits *in vivo* [1]. Accordingly, brain-implantable devices incorporating optical stimulation and low-noise data acquisition have been designed based on custom integrated circuits (ICs) to study the brain of small, freely behaving laboratory animals. However, no existing IC provides multichannel optogenetic photo-stimulation along with multi-unit electrophysiological recording capability within the same die [2-5]. Existing ICs also lack critical features: they are not multichannel and/or do not include an ADC [6], or they address only one signal modality [5-6], i.e., either local field potentials (LFP) or action potentials (AP). In this paper, we report an IC for simultaneous multichannel optogenetics and electrophysiological recording that addresses both LFP and AP signals at once. This 0.13 $\mu$m CMOS chip, which includes 4/10 stimulation/recording channels, is enclosed inside a small wireless optogenetic platform, and is demonstrated with simultaneous *in vivo* optical stimulation and electrophysiological recordings in a rat with virally mediated Channelrhodopsin-2 (ChR2) expression.

Key circuits for electrophysiological recording and optogenetics are depicted in this paper. We present a 3<sup>rd</sup>-order discrete-time (DT)  $\Delta\Sigma$  ADC architecture with on-chip decimation filters (DF). This  $\Delta\Sigma$  MASH 1-1-1 employs a bias duty-cycling strategy that decreases its power consumption and a digital subtraction technique between the  $\Delta\Sigma$  branch outputs that reduces its footprint size. This design offers an alternative to the widespread use of SAR ADCs by trading analog complexity for digital circuits, and by enabling a tunable precision. In particular, using a 3<sup>rd</sup>-order  $\Delta\Sigma$  allows for an oversampling ratio (OSR) small enough to put the DF on-chip, unlike in previous designs [4-5]. On the same die, optical stimulation is performed by a 4-ch LED driver circuit with feedback to precisely set the effective forward current of each LED, which improves on previous solutions based on open-loop drivers [2,6].

Figure 29.4.1 shows the schematic of the analog-front-end (AFE) circuit. It consists of a fully differential folded-cascode operational transconductance amplifier (OTA), using source-degeneration resistors to limit the input-referred noise contribution to only two transistors ( $M_3$  and  $M_{11}$ ) [3]. The ac-coupled OTA employs two capacitors and two pseudoresistor banks as feedback elements, which provides a 3b tunable high-pass filter cutoff frequency. This allows the selection of a suitable bandwidth for either the LFP or AP signal, while the low-pass analog cutoff frequency is fixed by capacitor  $C_5$ .

Figure 29.4.2 shows the block diagram of the  $\Delta\Sigma$  MASH 1-1-1 neural ADC. The 1<sup>st</sup> stage consists of a DT  $\Delta\Sigma$  modulator fed by the conditioned output signal of the AFE, while the 2<sup>nd</sup> and 3<sup>rd</sup> stages are two DT  $\Delta\Sigma$  modulators fed by the DAC value ( $DAC_{x-1}(z)$ ) minus the integrator output signal ( $I_{x-1}(z)$ ) from their respective previous stages. As a result, each stage cancels out the quantization noise induced by its previous stage. While switched-capacitor (SC) circuits are typically used to perform the subtraction between the DAC value and the integrator output [4], the reported topology improves on previous solutions by performing this subtraction digitally instead, which simplifies the circuit implementation: instead of performing  $DAC_{x-1}(z) - I_{x-1}(z)$  using SC,  $DAC_x(z) - DAC_{x-1}(z)$  is performed using the output bits of the current and previous D-Latches by the DAC with pre-subtraction module (see Fig. 29.4.2). The result of  $DAC_x(z) - DAC_{x-1}(z)$  can take only three possible values, VSS, VDD or  $V_{cm}$ , which are selected digitally using analog switches. This holds for the following special case used in this design:  $DAC_{ref1} = \frac{3}{4}VDD$  and  $DAC_{ref2} = \frac{1}{4}VDD$ . The reduced OSR needed by the  $\Delta\Sigma$  MASH 1-1-1, compared with 1<sup>st</sup>- or 2<sup>nd</sup>-order  $\Delta\Sigma$ , allows the OTAs in each of the  $\Delta\Sigma$  branches to remain idle for long periods of time. Therefore, a *bias duty cycling* technique is used to turn off the OTA bias current between each sampling cycle, allowing a decrease in power consumption of 41.3%, for a bias duty cycle of 35%, without noticeable impact on the precision.
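The three-level property of the pre-subtraction can be checked with a short behavioral sketch. It is only an illustration under stated assumptions: VDD is normalized to 1.0, VSS is taken as 0,  $V_{cm}$  is taken as VDD/2, and the difference is assumed to be referenced to  $V_{cm}$ ; the function and table names are hypothetical and do not come from the paper.

```python
# Behavioral sketch of the digital pre-subtraction (not the authors' circuit).
# Assumptions: VDD normalized to 1.0, VSS = 0, Vcm = VDD/2, and the
# DAC difference is referenced to Vcm.
VDD = 1.0
VSS = 0.0
VCM = VDD / 2

def dac_level(bit):
    """1b DAC: bit 1 -> DAC_ref1 = 3/4 VDD, bit 0 -> DAC_ref2 = 1/4 VDD."""
    return 0.75 * VDD if bit else 0.25 * VDD

def pre_subtract_analog(bit_cur, bit_prev):
    """Reference model: Vcm + (DAC_x - DAC_{x-1}) computed with real levels."""
    return VCM + dac_level(bit_cur) - dac_level(bit_prev)

# Digital equivalent: the two latch bits only select one of three rails.
SELECT = {(1, 0): VDD, (0, 1): VSS, (0, 0): VCM, (1, 1): VCM}

def pre_subtract_digital(bit_cur, bit_prev):
    """Switch-based model: pick VSS, Vcm or VDD from the two output bits."""
    return SELECT[(bit_cur, bit_prev)]

levels = sorted({pre_subtract_analog(a, b) for a in (0, 1) for b in (0, 1)})
print(levels)
```

With the 3/4 and 1/4 references the difference is always  $\pm\frac{1}{2}VDD$  or zero, which is why a pair of bits and three switches are sufficient in this model.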

Figure 29.4.3 shows the block diagram of the 4<sup>th</sup>-order cascaded integrator-comb (CIC4) DF. Unlike [4-5], this  $\Delta\Sigma$  with on-chip DF provides tunable precision and adjustable bandwidth by varying the OSR and the *clock reduction factor* M (see Fig. 29.4.3) depending on the AFE performance and type of signal to record (i.e., LFP or AP). This filter uses a highly optimized CMOS implementation based strictly on adders and registers, which avoids power-hungry multipliers and dividers. To use as few output bits per stage as possible without degrading the precision, the minimal binary precision of each stage is determined using the Hogenauer pruning technique, so as to accommodate OSR  $\leq 50$ , i.e., DF resolution  $\leq 14b$ . The DF is designed to provide an attenuation of 15dB at the Nyquist frequency, by setting M to 25.
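To illustrate why a CIC decimator needs only adders and registers, the following is a minimal software sketch of a generic 4<sup>th</sup>-order CIC with a decimation ratio of 25. It is a textbook structure, not the on-chip implementation: the differential delay of 1, the unpruned integer arithmetic, and the names `cic_decimate` and `full_precision_growth_bits` are assumptions made for this example.

```python
import math

def cic_decimate(samples, order=4, ratio=25):
    """Generic CIC decimator: `order` integrators at the input rate,
    then `order` combs (differential delay 1) at the decimated rate.
    Only additions, subtractions and state registers are used."""
    integrators = [0] * order
    comb_state = [0] * order
    out = []
    for n, x in enumerate(samples):
        acc = x
        for i in range(order):          # integrator cascade
            integrators[i] += acc
            acc = integrators[i]
        if (n + 1) % ratio == 0:        # keep one sample out of `ratio`
            y = acc
            for i in range(order):      # comb cascade
                prev = comb_state[i]
                comb_state[i] = y
                y -= prev
            out.append(y)
    return out

def full_precision_growth_bits(order=4, ratio=25):
    """Worst-case bit growth before any pruning: ceil(order * log2(ratio))."""
    return math.ceil(order * math.log2(ratio))

dc_out = cic_decimate([1] * 200)
print(dc_out[-1], full_precision_growth_bits())
```

The DC gain of this structure is `ratio ** order`, which is why the unpruned registers grow by roughly `order * log2(ratio)` bits; Hogenauer pruning then discards low-order bits stage by stage to reach a smaller output word.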

The schematic of the 4-ch LED driver circuit is presented in Fig. 29.4.4. Compared to previous designs [2,6], this circuit allows precise control of the forward current into each LED using a regulated cascode current source with feedback. The current can be adjusted independently of the LED parameters (forward voltage, etc.) and the battery voltage, guaranteeing that the right optical power ( $>0.1mW/mm^2$ ) is delivered by each LED to properly activate ChR2 neurons [1]. In Fig. 29.4.4, the wide-swing current mirror keeps the drain-source voltage of  $M_2$  at the edge of saturation ( $V_{ds2} \approx V_{eff}$ ), minimizing the voltage drop inside the chip to accommodate smaller battery voltages, and avoiding the need for small current sensing resistors with poor process accuracy and inter-channel matching. The LED driver is able to deliver constant forward currents into different LEDs typically used for optogenetics, with  $V_{forward}$  varying from 2.9 to 3.36V (see Fig. 29.4.4).

The input-referred noise of the AFE is 3.2 $\mu$ V<sub>rms</sub> over a 7kHz BW and its THD is 0.8% with a 1kHz/2.5mV<sub>p-p</sub> signal. Using the same input signal, the AFE in series with the  $\Delta\Sigma$  produces an ENOB of 7.38b with an input of 2.5mV<sub>p-p</sub>, yielding an ENOB of 8.68b at full scale (6mV<sub>p-p</sub>), whereas the theoretical ENOB of the  $\Delta\Sigma$  MASH 1-1-1 is 10.8b with an OSR of 25 ( $F_s = 500kHz$ ). At this OSR, the ENOB is limited by the THD and the noise of the AFE. The power consumption of the AFE, the  $\Delta\Sigma$  and the DF is 11.2 $\mu$ W/ch. The measured performance of the circuits is summarized in Fig. 29.4.5.

A Long-Evans rat with virally mediated ChR2 cortical expression (four sites in primary motor cortex injected with 0.3 $\mu$ l of AAV2/6 hSyn-ChR2-EGFP-Nav1.2ts viral construct, titer of  $8 \times 10^{12}$  GC/ml) was used to stimulate and record within two *in vivo* experiments wherein the rat was anesthetized with ketamine/xylazine. First, a micropipette (3 $\mu$ m tip) was lowered in the VPM region of the brain (4.2mm depth). Secondly, an optrode (8 $\times$ 0.5M $\Omega$  electrodes, 1 $\times$  fiber) was lowered in the cerebral motor cortex (1.5mm depth). All protocols were performed in accordance with Canadian Council on Animal Care Guidelines.

As seen in Fig. 29.4.6 (top left), the chip enclosed within the wireless test platform was used to record spontaneous activity with a maximum amplitude of 1.3mV<sub>p-p</sub> and a noise floor of 17 $\mu$ V<sub>rms</sub> using the glass micropipette. Figure 29.4.6 (top right) presents several clusters of spikes collected within 3 recording sessions. Figure 29.4.6 (bottom left) presents the wireless platform, which uses the IC, during the *in vivo* experiment and Fig. 29.4.6 (bottom right) shows the evoked neural activity acquired after optical stimulation through the optrode. The platform concept and IC are shown in Fig. 29.4.7.

### Acknowledgments:

We thank CMC Microsystems for supporting the chip fabrication. This project is funded by the Natural Sciences and Engineering Research Council of Canada (NSERC), the *Fonds de Recherche du Québec - Nature et Technologies* (FRQNT) and the Weston Foundation.

### References:

- [1] G. Buzsáki, et al., "Tools for Probing Local Circuits: High-Density Silicon Probes Combined with Optogenetics," *Neuron*, vol. 86, no. 1, pp. 92–105, 2015.
- [2] H.-M. Lee, et al., "A Power-Efficient Switched-Capacitor Stimulating System for Electrical/Optical Deep Brain Stimulation", *IEEE JSSC*, vol. 50, no. 1, pp. 360-374, 2015.
- [3] W. Wattanapanitch, et al., "An Energy-Efficient Micropower Neural Recording Amplifier," *IEEE T BioCAS*, vol. 1, no. 2, pp. 136-147, 2007.
- [4] L. Wang, et al., "An 18 $\mu$ W 79dB-DR 20KHz-BW MASH  $\Delta\Sigma$  Modulator Utilizing Self-Biased Amplifiers for Biomedical Applications", *IEEE CICC*, 4 pages, 2011.
- [5] H. Kassiri, et al., "All-Wireless 64-Channel 0.013mm<sup>2</sup>/ch Closed-Loop Neurostimulator with Rail-to-Rail DC Offset Removal", *ISSCC Dig. Tech. Papers*, pp. 452-453, 2017.
- [6] C.-H. Chen, et al., "An Integrated Circuit for Simultaneous Extracellular Electrophysiology Recording and Optogenetic Neural Manipulation", *IEEE TBME*, vol. 64, no. 3, pp. 557-568, 2017.



**Figure 29.4.1:** Circuit schematic of the AFE using folded-cascode OTA with source-degeneration resistors  $R_1$  and  $R_2$ .



**Figure 29.4.3:** CIC4 DF with optimized binary precision between the stages, for up to 14b output resolution.



**Figure 29.4.5:** Experimentally measured performance (top); Electrophysiological recording and optogenetic IC comparison (bottom).



**Figure 29.4.2:**  $\Delta\Sigma$  MASH 1-1-1 using the pre-subtraction module (2<sup>nd</sup> & 3<sup>rd</sup> stages) and a bias duty-cycling technique.



**Figure 29.4.4:** Circuit schematic of the LED driver (top); regulated current for 3 different types of LEDs (bottom).



**Figure 29.4.6:** Neural activity collected with a glass micropipette (top) and wirelessly using an optrode (bottom).



Figure 29.4.7: Die micrograph (right); system-level concept using the IC within an optogenetic platform (left).

## 29.5 A mm-Sized Free-Floating Wirelessly Powered Implantable Optical Stimulating System-on-a-Chip

Yaoyao Jia<sup>1</sup>, S. Abdollah Mirbozorgi<sup>1</sup>, Byunghun Lee<sup>1</sup>, Wasif Khan<sup>2</sup>, Fatma Madi<sup>2</sup>, Arthur Weber<sup>2</sup>, Wen Li<sup>2</sup>, Maysam Ghovanloo<sup>1</sup>

<sup>1</sup>Georgia Institute of Technology, Atlanta, GA

<sup>2</sup>Michigan State University, East Lansing, MI

Thanks to its cell-type specificity, high spatiotemporal precision, and reversibility, optogenetic neuromodulation has been widely utilized in brain mapping, visual prostheses, psychological disorders, Parkinson's disease, epilepsy, and cardiac electrophysiology [1]. While a variety of optical neural interfaces have been developed, most have substantial limitations due to their size and tethering, needed to either deliver light or electricity, which may restrict the animal movements and bias the results, particularly in behavioral studies. In contrast, wirelessly powered optogenetic interfaces improve accuracy, reliability, and validity of the outcomes by eliminating tethers. Recently, a few wirelessly powered optogenetics approaches have been reported with impressive reduction in size of the implant [2]. However, their practical application is impeded by requiring high operating frequencies in GHz range, which increases the risk of exposure to unsafe electromagnetic specific absorption rates (SAR), resulting in excessive heat generation. They also lack proper control over optical stimulus characteristics. Towards this end, we propose a practical mm-sized Free-Floating Wirelessly-powered Implantable Optical Stimulating (FF-WIOS) SoC to not only eliminate the tethering effects but also reduce the level of invasiveness and SAR in the tissue.

Figure 29.5.1 shows a conceptual view of the FF-WIOS that is intended to be freely distributed on a desired surface of the brain that is encompassed by a high quality-factor (Q) resonator ( $L_{Res}$ ) along with a passive mockup that shows device microassembly, and a fully-functional active prototype on a PCB. Each FF-WIOS implant consists of an ASIC, the backside of which has micromachined reflectors, a receiver coil ( $L_{Rx}$ ) made of bonding wire, and four SMD capacitors. The ASIC also provides mechanical support for a 4x4  $\mu$ LED array that is positioned on a transparent flex-substrate under the reflectors. To deliver >2mW in the near-field region to the FF-WIOS within  $L_{Res}$  without surpassing the SAR limit, a 3-coil inductive link is geometrically optimized to homogeneously distribute the magnetic field within the high-Q resonator that is implanted under the scalp. The Tx coil ( $L_{Tx}$ ) is embedded in a headstage on top of the subject's head and driven by a power amplifier (PA) that can also modulate the carrier amplitude to send forward commands that adjust stimulation parameters.

Figure 29.5.2 shows the overall FF-WIOS system architecture. A power-management block supplies the rest of the SoC with a regulated voltage,  $V_{REG}=1.8V$ , using a voltage doubler and a capless LDO. The doubler's main task is to continuously charge a 10 $\mu$ F storage cap,  $C_{LED}$ , which is then discharged into a selected  $\mu$ LED. The timing control block generates a stimulation control signal,  $Stim$ , with adjustable width and duty cycle, for capacitor charge and discharge. A current limiter adjusts the optical stimulation light intensity by setting an upper bound to the current flowing through the target  $\mu$ LED. In the forward data telemetry block [3], a pulse-position-modulated clock/data recovery (PPM-CDR) circuit recovers synchronized clock/data from the on-off-keyed (OOK) coil voltage,  $V_{COIL}$ . When the incoming data stream is matched with a predefined 10b preamble, data is stored in a 12b register for serial-to-parallel (S2P) conversion. Load-shift-keying (LSK) back telemetry is adopted for closed-loop power control by sensing the  $V_{COIL}$  amplitude.
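The preamble match followed by a 12b capture can be sketched in software as follows. This is only an illustration: the preamble bit pattern, the assumption that the 12 payload bits immediately follow the preamble, and the function name `extract_payloads` are hypothetical and are not specified in the text.

```python
# Hypothetical 10b preamble; the actual pattern is not given in the paper.
PREAMBLE = (1, 0, 1, 0, 1, 1, 0, 0, 1, 1)

def extract_payloads(bits, preamble=PREAMBLE, payload_len=12):
    """Scan a recovered bit stream with a sliding window. When the last
    len(preamble) bits match, latch the next payload_len bits as one
    parallel word (modeling the S2P register)."""
    payloads = []
    window = []
    i = 0
    while i < len(bits):
        window.append(bits[i])
        window = window[-len(preamble):]
        i += 1
        if tuple(window) == tuple(preamble) and i + payload_len <= len(bits):
            payloads.append(list(bits[i:i + payload_len]))
            i += payload_len      # skip the bits that were just latched
            window = []           # restart the preamble search
    return payloads

word = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1]
stream = [0, 0] + list(PREAMBLE) + word + [0]
print(extract_payloads(stream))
```

Gating the register on a preamble match is a common way to keep spurious demodulated bits from overwriting stimulation settings; whether the chip handles partial frames the same way is not stated.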

A 3-coil inductive link is optimized based on [4] by considering the power conversion efficiency (PCE) of the doubler vs. frequency. By maximizing the link power transfer efficiency (PTE), which is a combination of PTE of the inductive link and PCE of the doubler, 60MHz is selected as the carrier frequency, with the results presented in Fig. 29.5.3. Both power link and tissue layers are modeled in the HFSS environment, with  $L_{Rx}$  and  $L_{Res}$  implanted over the brain and skull, respectively. The SAR is simulated at 0.178W/kg for delivery of 2.7mW to the doubler, which is well below the limit of 1.6W/kg. Since the required data rate is low, the OOK data transmission does not affect the power link optimization.

Figure 29.5.4 shows the schematics of the OOK demodulator, PPM-CDR, voltage doubler, and capless LDO, along with measured data transmission and  $C_{LED}$  charging and discharging waveforms. A PPM signal,  $S_{PPM}$ , is extracted from  $V_{COIL}$  and converted to a clock,  $CLK$ , through a frequency divider, DFF<sub>1</sub>.  $CLK$  controls the amplitude of  $V_{PPM}$  by alternately charging and discharging  $C_3$ . If the positioning ratio among three pulses of  $S_{PPM}$  is 4:1,  $V_{PPM}$  exceeds  $V_{REF2}$  during  $CLK=1$ , leading to  $DATA=1$ . Otherwise,  $DATA=0$  when the positioning ratio is 1:4.  $V_{COIL}$  is OOK-demodulated at 50kb/s to generate  $S_{PPM}$ , which is converted to synchronized  $CLK$  and  $DATA$ . The stimulation parameters are set only once. The voltage doubler has a built-in charging cell to charge  $C_{LED}$ , which is disabled when  $STIM$  is set to high. A 1.8V, 300 $\mu$ A capless LDO, composed of an error amplifier, a high output swing second stage with a compensation capacitor,  $C_4$ , and a power NMOS,  $N_7$ , is used to reduce the number of off-chip capacitors. In the measurements, stimulation is applied with a pulse width of 2ms and a frequency of 10Hz. The current limiter sets  $I_{LED}$  to a desired value up to 10mA.  $C_{LED}$  is discharged during stimulation and recharged to the target voltage of 5V within 30ms without dropping  $V_{REG}$ .
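The PPM decision rule can be mimicked in a few lines. This is a behavioral sketch only: it assumes that the three pulses of one symbol are given as timestamps and that the bit is decided by comparing the two intervals between them, whereas the chip makes the decision in the analog domain by comparing  $V_{PPM}$  with  $V_{REF2}$ . The function name and the 16/4 time-unit split are illustrative.

```python
def decode_ppm_bit(t0, t1, t2):
    """Decide one bit from three pulse timestamps (any consistent unit).
    A long-then-short spacing (e.g. 4:1) gives DATA=1,
    a short-then-long spacing (e.g. 1:4) gives DATA=0."""
    first = t1 - t0
    second = t2 - t1
    if first <= 0 or second <= 0:
        raise ValueError("pulse timestamps must be strictly increasing")
    return 1 if first > second else 0

# At 50kb/s one symbol lasts 20 time units of 1 microsecond each,
# split here as 16 + 4 (ratio 4:1) or 4 + 16 (ratio 1:4).
bit_one = decode_ppm_bit(0, 16, 20)
bit_zero = decode_ppm_bit(0, 4, 20)
print(bit_one, bit_zero)
```

Because only the ratio of the two intervals matters in this model, the decision does not depend on the absolute symbol period.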

Figure 29.5.5 shows the FF-WIOS *in vitro* and *in vivo* experiment setups. Fresh sheep brain and skull are used to realistically model the tissue layers. The FF-WIOS device was prepared for the *in vitro* test by mounting the FF-WIOS ASIC and off-chip capacitors on a flexible polyimide substrate with a 2x2  $\mu$ LED array (Cree TR2227TM) on the backside, resulting in a 4.5mm FF-WIOS prototype.  $L_{Rx}$ , made of magnet wire, is located around the polyimide substrate. The FF-WIOS ASIC was directly wire-bonded onto the substrate.

*In vivo* animal experiments were conducted to verify the functionality of the FF-WIOS in live neural tissue within the brain of anesthetized rats. The FF-WIOS ASIC was mounted on a rigid board (14x8mm<sup>2</sup>) with off-chip capacitors on top and a 2x2  $\mu$ LED array (OSRAM) on the backside. The first headstage prototype is composed of a class-E PA with a switch to modulate the Tx carrier amplitude. A proof-of-concept study was conducted using this PA to deliver power to  $L_{Rx}$  with 1.5mm diameter to energize the FF-WIOS ASIC. Two adult Sprague Dawley rats (male & female, 600–650g) received AAV virus (AAV-CaMKIIa-hChR2(H134R)-mCherry) injection in the primary visual cortex ( $V_1$ ) so that neurons express light-sensitive channelrhodopsin-2 (ChR2). Under anesthesia, unilateral optical stimulation was performed on only one  $V_1$  lobe of each rat, during which the FF-WIOS system drove a  $\mu$ LED with a 2ms pulse train at 2.5Hz with  $I_{LED}$  of 2.5mA or 10mA. Local field potentials (LFPs) with below- and above-threshold optical stimulation were recorded from the tested  $V_1$  using a penetrating tungsten electrode, and acquired/amplified using an Intan RHD2164 system. Results in Fig. 29.5.6 show that higher  $I_{LED}$  enables higher light intensity ( $\geq 10\text{mW/mm}^2$ ), leading to stronger LFP variations in the target tissues. The instantaneous phases of the light-evoked LFPs in a 1-to-25Hz window from 200 individual trials were differentiated with colors, aligned based on the stimulus ON time, and stacked. With 10mA  $I_{LED}$ , phase-locked (almost deterministic) neuronal oscillations were observed for ~100ms without latency, while 2.5mA  $I_{LED}$  did not induce significant phase synchronization. Additionally, an increase in power spectral density (PSD) was observed in a short time window following the optical stimulation. As expected, 10mA  $I_{LED}$  resulted in a larger increase in PSD than 2.5mA optical stimulation.

### Acknowledgments:

The authors would like to thank P. Yeon and U. Guler for their help with schematic design, and Z. Wang for technical support on the software.

### References:

- [1] J. Rivnay, et al., "Next-Generation Probes, Particles, and Proteins for Neural Interfacing," *Science Advances*, vol. 3, no. 6, June 2017.
- [2] K.L. Montgomery, et al., "Wirelessly Powered, Fully Internal Optogenetics for Brain, Spinal and Peripheral Circuits in Mice," *Nature Methods*, vol. 12, no. 10, pp. 969–974, Aug. 2015.
- [3] H.M. Lee, et al., "A Power-Efficient Switched-Capacitor Stimulating System for Electrical/Optical Deep Brain Stimulation," *IEEE JSSC*, vol. 50, no. 1, pp. 360–374, Jan. 2015.
- [4] S.A. Mirbozorgi, et al., "Robust Wireless Power Transmission to mm-Sized Free-Floating Distributed Implants," *IEEE T BioCAS*, vol. 11, no. 3, pp. 692–702, June 2017.
- [5] G.G. Turcotte, et al., "A Wireless Headstage for Combined Optogenetics and Multichannel Electrophysiological Recording," *IEEE T BioCAS*, vol. 11, no. 1, pp. 1–14, Feb. 2017.
- [6] S. Park, et al., "Soft, Stretchable, Fully Implantable Miniaturized Optoelectronic Systems for Wireless Optogenetics," *Nat. Biotechnol.*, vol. 33, no. 12, pp. 1280–1286, Dec. 2015.



Figure 29.5.1: A conceptual view of the FF-WIOS system with a passive mockup, showing its microassembly, and a wirelessly powered prototype with a 4x4 LED array on a  $3.3 \times 1.2 \text{ cm}^2$  PCB.



Figure 29.5.2: Overall architecture of the FF-WIOS system-on-a-chip for wireless optogenetic stimulation.



Figure 29.5.3: 3-coil link configuration with SAR simulation in HFSS, optimization algorithm flowchart, PTE simulation and measurement results, and specifications of optimized coils at 60MHz.



Figure 29.5.4: Schematic diagrams of OOK demodulator and PPM-CDR in forward data telemetry block, and voltage doubler and capless LDO in power-management block, along with transient measurement results.



Figure 29.5.5: In vitro and in vivo measurement setups with two FF-WIOS prototypes, one on a flexible polyimide substrate ( $3.5 \times 3 \text{ mm}^2$ ) and the other on rigid FR4 ( $14 \times 8 \text{ mm}^2$ ).



Figure 29.5.6: In vivo experiment results from an anesthetized rat with above threshold (left,  $I_{LED}=10 \text{ mA}$ ) and below threshold (right,  $I_{LED}=2.5 \text{ mA}$ ) optical stimulation.



Figure 29.5.7: Die micrograph, summary of FF-WIOS specifications, and benchmarking table.

## 29.6 A 92dB Dynamic Range Sub- $\mu$ V<sub>rms</sub>-Noise 0.8 $\mu$ W/ch Neural-Recording ADC Array with Predictive Digital Autoranging

Chul Kim, Siddharth Joshi, Hristos Courellis, Jun Wang, Cory Miller, Gert Cauwenberghs

University of California, San Diego, La Jolla, CA

High-density multi-channel neural recording is critical to driving advances in neuroscience and neuroengineering through increasing the spatial resolution and dynamic range of brain-machine interfaces. Neural-signal-acquisition ICs have conventionally been designed with two distinct functional blocks per recording channel: a low-noise amplifier front-end (AFE), and an analog-digital converter (ADC) [1,2]. Hybrid architectures utilizing oversampling ADCs with digital feedback [3-5] have seen recent adoption due to their increased power and area efficiency. Still, input dynamic range (DR) is relatively limited due to aggressive supply voltage scaling and/or kT/C sampling noise. This paper presents a neural-recording ADC chip with 92dB input dynamic range and 0.99 $\mu$ V<sub>rms</sub> of noise at 0.8 $\mu$ W power consumption per channel over 500Hz signal bandwidth, owing to 1) a predictive digital autoranging (PDA) scheme in a hybrid analog-digital 2<sup>nd</sup>-order oversampling ADC architecture, and 2) the absence of an explicit sampling process through capacitors, which avoids kT/C noise altogether. Digitally predicting the analog input at 12b resolution from a 1b quantization of the continuously integrated residue at an effective oversampling ratio (OSR) of 32, the PDA handles a  $\pm$ 130mV electrode differential offset (EDO) and recovers from  $>$ 200mV<sub>pp</sub> transient artifacts within <1ms. Furthermore, using digital circuits for integration ensures the architecture benefits from process scaling and the resulting compactness makes it suitable for incorporation in high-density recording arrays.

Figure 29.6.1 presents the system diagram and circuit architecture for one of 16 ADC channels in the neural-signal-acquisition IC, with the analog integrator and comparator highlighted. The digital feedback along with the continuous analog integration implements a 2<sup>nd</sup>-order predictive loop accommodating potentially large offset and slope at the input, such as EDO in the DC-coupled input and higher-frequency content in the signal at relatively low OSR. Unlike conventional 2<sup>nd</sup>-order delta-sigma modulation, the input enters the second integrator, where zero input to the first integrator ( $u = 0$ ) ensures stable saturation-free loop dynamics with only 1b quantization (Fig. 29.6.1 top). The resulting 1<sup>st</sup>-order differentiation in the signal transfer function produces 1<sup>st</sup>-order noise-shaping of the quantizer in the input; however, the extra loop gain contributed by the first digital integrator leads to improved resolution owing to digital prediction improving with OSR. The input dynamic range and transient response of the ADC loop are substantially improved by radix-2 autoranging of the quantizer, in which the history of the quantizer bits  $D[n]$  triggers either a factor-of-two expansion or contraction in the digital feedback from the quantizer  $y[n]$ . A 3b exponent  $e[n]$  covers 7 octaves (1, 2, ..., 128) in digital gain, where a run of five successive decisions with identical polarity increments the exponent, expanding the range, whereas a run of three alternating-polarity decisions decrements the exponent, contracting the range. The combination of digital prediction and radix-2 autoranging constitutes PDA. A reference-chopped 12-bit 6b+6b segmented DAC reconstructs  $p[n]$ , the predicted analog value of the chopped input, with the resulting residue  $x[n] - p[n]$  unchopped to baseband for continuous-time integration onto  $C_{INT}$  and quantization (Fig. 29.6.1 bottom). 
The digital prediction  $p[n]$  in turn is obtained as the instantaneous sum of the digital feedback  $y[n]$  and its running accumulation, completing the 2<sup>nd</sup>-order loop. A radix-2 variable-step up/down counter implements the update in  $p[n]$  in two phases: a double increment/decrement step at the counter's binary input position  $e[n] + 1$ , followed by a retracing step with opposite polarity at input position  $e[n]$  just before the next cycle. The 16 channels on-chip share common reference, bias and control signals, and their outputs  $D_{1..16}[n]$  are daisy-chained at the output to enable higher channel counts through cascaded multi-chip configuration.
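The autoranging rule described above can be sketched as a small state machine. This is a behavioral illustration, not the on-chip logic: clearing the decision history after every exponent change, checking the five-run rule before the three-alternation rule, and the class and method names are all assumptions made for the example.

```python
class Radix2Autoranger:
    """Tracks a 3b exponent e in 0..7, giving a feedback gain of 2**e (1..128)."""

    def __init__(self, max_exp=7):
        self.exp = 0
        self.max_exp = max_exp
        self.history = []          # recent quantizer decisions, each +1 or -1

    def update(self, d):
        """Feed one quantizer decision d (+1 or -1); return the new exponent."""
        self.history.append(d)
        self.history = self.history[-5:]
        last5 = self.history
        last3 = self.history[-3:]
        if len(last5) == 5 and all(b == last5[0] for b in last5):
            # Five identical decisions: the loop is slewing, so expand the range.
            self.exp = min(self.exp + 1, self.max_exp)
            self.history = []      # assumption: restart the run count
        elif len(last3) == 3 and last3[0] != last3[1] and last3[1] != last3[2]:
            # Three alternating decisions: the loop is tracking, so contract.
            self.exp = max(self.exp - 1, 0)
            self.history = []
        return self.exp

    def feedback(self, d):
        """Digital feedback y[n] = D[n] * 2**e[n] for the current exponent."""
        return d * (1 << self.exp)

ar = Radix2Autoranger()
trace = [ar.update(d) for d in [1] * 10 + [1, -1, 1]]
print(trace, ar.feedback(-1))
```

In this sketch a sustained run of equal decisions doubles the step every five samples, so a large transient is followed with exponentially growing steps rather than unit steps, and the step shrinks again once the decisions start to alternate.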

A 2-stage fully differential amplifier (Fig. 29.6.2 top left) with two independent stages of common-mode feedback feeds into a 1.35pF integration capacitor  $C_{INT}$ . Current biases for  $I_{B1}$  and  $I_{B2}$  are set to 375nA and 25nA, respectively. Current-reusing nMOS and pMOS input pairs in the first stage boost transconductance to 22 $\mu$ S for improved NEF, while 600mV<sub>pp</sub> output swing at 0.8V supply in the second stage increases spurious-free dynamic range. The simulated signal gain of the integrator is greater than 46dB near the 32kHz chopping frequency (Fig. 29.6.2 center left). A two-stage comparator (Fig. 29.6.2 top right) performs 1b quantization. Decision time ranges from 1.5 to 2 $\mu$ s depending on input amplitude, dominated by capacitive loading ( $C_T = 20fF$ ) of the first-stage current-starved ( $I_C = 20nA$ ) pre-amplifier. Each of two differential segmented 6b+6b DACs is implemented with two 64-element custom arrays of 2fF unit capacitors  $C_0$ , bridged by 4% larger capacitor  $C_0'$  (Fig. 29.6.2 bottom left). Timing of the two-phase updates in the digital prediction state variable  $p$  is triggered by initiation and settling of the comparator output (Fig. 29.6.2 bottom right).

Figure 29.6.3 shows the measured input-referred noise of the combined front-end and ADC, with input shorted to the reference (IN = REF in Fig. 29.6.1). Chopping above 8kHz reduces the noise density below 50nV/ $\sqrt{Hz}$ , resulting in 0.99 $\mu$ V<sub>rms</sub> integrated input-referred noise and 1.81 noise efficiency factor (NEF) at 32kHz chopping frequency and 1 $\mu$ A supply. The measured effect of PDA on spectral and transient response is highlighted in Fig. 29.6.4. Without PDA, the response to a large step transient is slew-rate-limited due to unity increments/decrements in the digital feedback. With PDA, measurements show a 30 $\times$  bandwidth improvement for 4mV amplitude signals, and <1ms recovery to  $\pm$ 100mV input transients. The DC-coupled input is capable of capturing slow potentials (<0.1Hz) while accommodating EDO up to  $\pm$ 130mV. For larger EDO, AC-coupled operation (shown for reference in Fig. 29.6.4 top) is obtained by connecting the DC-coupled input through a pair of external series capacitors. The measured effect of PDA on increasing input dynamic range is shown in Fig. 29.6.5. PDA extends the input signal range, at greater than 50dB SNDR, by 22dB, approaching the full-scale range of the DAC, covering 92dB input dynamic range (Fig. 29.6.5 bottom). The SNDR improvements at large input signal amplitude reaching 66dB result from both reduced spurs and reduced noise floor owing to PDA (Fig. 29.6.5 top). The measured input impedance is greater than 26M $\Omega$  at 32kHz chopping frequency.

*In vivo* local-field potential (LFP) recordings using the 16-channel neural-acquisition IC connecting to a NeuralLynx microwire electrode array inserted in frontal cortex of a marmoset primate (*Callithrix jacchus*) are shown in Fig. 29.6.6 (top), resolving slow potentials (<0.1Hz) of 200mV<sub>pp</sub> amplitude comparable to the ECoG signal range indicative of subject arousal state that are often missed by AC-coupled commercial neural instrumentation unless with severe degradation in SNR [6]. Comparison of key metrics with the state-of-the-art in neural recording ICs is given in the Table (Fig. 29.6.6 bottom). In addition to NEF, the neural ADC achieves a power efficiency factor (PEF) of 2.6, almost a fourfold improvement among integrated front-end ADCs reported in the literature. The 16-channel neural ADC array measures 1 $\times$ 1mm<sup>2</sup> in 65nm CMOS, with 0.024mm<sup>2</sup> per channel (Fig. 29.6.7).
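The headline efficiency figures can be cross-checked from the numbers quoted in the text and in the comparison table of Fig. 29.6.6. The short script below is only a consistency check using PEF = NEF<sup>2</sup>·V<sub>DD</sub> as defined in that table; the helper names are illustrative.

```python
import math

def pef(nef, vdd):
    """Power efficiency factor as defined in the comparison table: NEF^2 * VDD."""
    return nef ** 2 * vdd

def noise_density_nv_per_rthz(v_rms_uv, bandwidth_hz):
    """Average input-referred noise density in nV/sqrt(Hz) over the bandwidth."""
    return v_rms_uv * 1e3 / math.sqrt(bandwidth_hz)

# This work: NEF 1.81 at 0.8V, 0.99 uVrms over 500Hz, 0.8 uW per channel.
pef_this_work = pef(1.81, 0.8)
density = noise_density_nv_per_rthz(0.99, 500)
supply_current_ua = 0.8 / 0.8            # uW divided by V gives uA

# Prior integrated front-end ADCs from the table ([3], [4], [5]).
pef_prior = [pef(4.76, 0.5), pef(2.86, 1.2), pef(7.8, 1.0)]
improvement = min(pef_prior) / pef_this_work

print(round(pef_this_work, 1), round(density), round(improvement, 2))
```

The results agree with the quoted PEF of 2.6, the 44nV/√Hz table entry, the 1 $\mu$ A supply current, and the "almost fourfold" improvement over the best prior integrated front-end ADC in the table.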

### References:

- [1] W. M. Chen, et al., "A Fully Integrated 8-Channel Closed-Loop Neural-Prosthetic CMOS SoC for Real-Time Epileptic Seizure Control," *IEEE JSSC*, vol. 49, no. 1, pp. 232-247, 2014.
- [2] H. Chandrasekaran and D. Markovic, "A 2.8 $\mu$ W 80mV<sub>pp</sub>-Linear-Input-Range 1.6G $\Omega$ -Input Impedance Bio-Signal Chopper Amplifier Tolerant to Common-Mode Interference up to 650mV<sub>pp</sub>," *ISSCC Dig. Tech. Papers*, pp. 448-449, 2017.
- [3] R. Muller, et al., "A Minimally Invasive 64-Channel Wireless  $\mu$ ECoG Implant," *IEEE JSSC*, vol. 50, no. 1, pp. 344-359, 2015.
- [4] H. Kassiri, et al., "All-Wireless 64-Channel 0.013mm<sup>2</sup>/ch Closed-Loop Neurostimulator with Rail-to-Rail DC Offset Removal," *ISSCC Dig. Tech. Papers*, pp. 452-453, 2017.
- [5] B. C. Johnson, et al., "An Implantable 700 $\mu$ W 64-Channel Neuromodulation IC for Simultaneous Recording and Stimulation with Rapid Artifact Recovery," *IEEE Symp. VLSI Circuits*, pp. C48-C49, 2017.
- [6] J.A. Hartings, et al., "Recovery of Slow Potentials in AC-Coupled Electrocorticography: Application to Spreading Depolarizations in Rat and Human Cerebral Cortex," *J. Neurophysiology*, vol. 102, no. 4, pp. 2563-2575, 2009.



Figure 29.6.1: System diagram and circuit architecture of predictive digital autoranging (PDA) neural ADC.



Figure 29.6.2: PDA neural ADC circuit implementation and timing waveforms.



Figure 29.6.3: Measured noise spectral density and integrated noise for varying chopping frequency and supply current.



Figure 29.6.4: Measured large-signal bandwidth and transient response, with and without PDA.



Figure 29.6.5: Measured output, signal-to-noise-and-distortion ratio, and dynamic range, with and without PDA.

|                                         | JSSC14 [1]        | ISSCC17 [2]      | JSSC15 [3]      | ISSCC17 [4]  | VLSI17 [5] | This work              |
|-----------------------------------------|-------------------|------------------|-----------------|--------------|------------|------------------------|
| Power/Ch (μW)                           | 0.97              | 2.8              | 2.3             | 0.63         | 8          | 0.8                    |
| Supply voltage (V)                      | 1.8               | 1.2              | 0.5             | 1.2          | 1          | 0.8                    |
| Noise density (nV/√Hz) <sup>a</sup>     | 63                | 127              | 58              | 101          | 71         | 44                     |
| NEF                                     | 1.77 <sup>b</sup> | 7.4 <sup>b</sup> | 4.76            | 2.86         | 7.8        | 1.81                   |
| PEF (NEF <sup>2</sup> V <sub>DD</sub> ) | 5.6 <sup>b</sup>  | 66 <sup>b</sup>  | 11.3            | 9.8          | 60.8       | 2.6                    |
| ENOB (bits)                             | 9.57              | --               | --              | 11.7         | 10.2       | 10.7                   |
| Input dynamic range (dB)                | --                | 81 <sup>c</sup>  | 50 <sup>c</sup> | --           | 90         | 92                     |
| EDO range (mV <sub>pp</sub> )           | N/A <sup>d</sup>  | N/A <sup>d</sup> | 100             | rail-to-rail | 100        | 260 / N/A <sup>d</sup> |
| Rapid recovery                          | no                | no               | no              | no           | yes        | yes                    |
| Area/Ch (mm <sup>2</sup> )              | 0.09              | 0.069            | 0.025           | 0.013        | 0.06       | 0.024                  |
| Technology (nm)                         | 180               | 40               | 65              | 130          | 180        | 65                     |

<sup>a</sup>Input-referred noise/√BW, differential configuration

<sup>b</sup>Front-end amplifier only, excluding ADC

<sup>c</sup>SNDR=0dB estimated from input-referred noise

<sup>d</sup>AC-coupled

Figure 29.6.6: In vivo LFP recordings from marmoset frontal cortex, and metric comparison with state of the art.



Figure 29.6.7: 16-channel PDA neural ADC IC micrograph and single channel detail.

## 29.7 A 110dB-CMRR 100dB-PSRR Multi-Channel Neural-Recording Amplifier System Using Differentially Regulated Rejection Ratio Enhancement in 0.18μm CMOS

Sehwan Lee<sup>1</sup>, Arup K. George<sup>1</sup>, Taeju Lee<sup>2</sup>, Jun-Uk Chu<sup>3</sup>, Sungmin Han<sup>4</sup>, Ji-Hoon Kim<sup>5</sup>, Minkyu Je<sup>2</sup>, Junghyup Lee<sup>1</sup>

<sup>1</sup>Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea

<sup>2</sup>KAIST, Daejeon, Korea

<sup>3</sup>Korea Institute of Machinery and Materials, Daegu, Korea

<sup>4</sup>Korea Institute of Science and Technology, Seoul, Korea

<sup>5</sup>Seoul National University of Science and Technology, Seoul, Korea

Multi-channel neural-recording amplifier systems have evolved into the method of choice for analyzing neurophysiological behavior, and are leading to a deeper understanding of the human brain [1-4]. Such systems operate from a noisy supply and ground, especially when they are powered wirelessly. As shown in Fig. 29.7.1, the amplifiers ought to be low-noise, low-power, and resilient against environmental noise and interference capacitively coupled from the power lines (220V/60Hz). In terms of specifications, these requirements translate into high CMRR, TCMRR, and PSRR. TCMRR (total CMRR) is a more realistic specification than CMRR because it also includes the effect of the electrode impedance ($Z_e$) and the amplifier input impedance ($Z_{in}$). In fact, the TCMRR should be >70dB for reliable detection of a 5μV<sub>rms</sub> neural signal [1].

The TCMRR of conventional differential neural amplifiers (Fig. 29.7.1(a)) is fundamentally limited by the mismatch between the signal input $V_{in}$ and the shared reference input $V_{ref}$ [1]. To minimize this mismatch, prior works avoid $V_{ref}$ sharing by using similar single-ended amplifiers for both $V_{in}$ and $V_{ref}$ (Fig. 29.7.1(b)) [1,4]. However, this method becomes less effective when large $Z_e$ values accentuate the mismatch between the potential dividers formed at the amplifier inputs (between $Z_e$ and $Z_{in}$): TCMRR>70dB is almost unattainable for $Z_e>50k\Omega$ [1]. Secondly, as their PSRR performance deteriorates rapidly beyond several hundred Hz [1], single-ended amplifiers pose a significant impediment when the supply and ground are noisy. To mitigate these issues, we report a multi-channel neural-recording amplifier system employing a differentially regulated rejection-ratio enhancement (DR<sup>3</sup>E) scheme (Fig. 29.7.1(c)). The key idea of this method is to superimpose the common-mode signal $V_{cm}$ on the system ground, so that potential division does not happen between $Z_e$ and $Z_{in}$ at the amplifier inputs. Secondly, a floating voltage $V_{dc}$, referenced to this ground, is used as the amplifier supply. As a result, common-mode disturbances cannot affect the differential output even in the presence of input-impedance mismatches or when the CM rejection capability of the amplifier becomes limited. Thus, the TCMRR of the amplifier could theoretically be infinite, limited only by the accuracy of the common-mode superimposition. Furthermore, as the amplifier rails are isolated from the system rails, a high PSRR follows as a natural consequence.
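The divider-mismatch argument above can be illustrated numerically. The following is a minimal sketch, not the authors' analysis: it treats all impedances as real magnitudes, and the input impedance (100MΩ) and the 20% electrode mismatch are assumed values chosen only for illustration.

```python
import math

def divider_gain(z_in: float, z_e: float) -> float:
    """Fraction of the electrode voltage that reaches the amplifier input."""
    return z_in / (z_in + z_e)

def mismatch_limited_cmrr_db(z_in: float, z_e_sig: float, z_e_ref: float) -> float:
    """Rejection limit set only by unequal Ze/Zin dividers on the two inputs."""
    error = abs(divider_gain(z_in, z_e_sig) - divider_gain(z_in, z_e_ref))
    return math.inf if error == 0 else 20 * math.log10(1 / error)

Z_IN = 100e6  # assumed amplifier input impedance magnitude, in ohms

for z_e in (10e3, 50e3, 100e3):
    # Assumption: the reference electrode impedance is 20% lower than the signal one.
    limit_db = mismatch_limited_cmrr_db(Z_IN, z_e, 0.8 * z_e)
    print(f"Ze = {z_e / 1e3:.0f} kOhm -> divider-limited rejection ~ {limit_db:.1f} dB")
```

Under these assumptions the limit falls as $Z_e$ grows, which is the trend the text describes; the absolute numbers depend entirely on the assumed $Z_{in}$ and mismatch.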

Figure 29.7.2 shows the architecture of the 16-channel neural amplifier system employing DR<sup>3</sup>E. A nested current-recycling technique, where current is reused across stacked channels and also within an LNA (devices $M_1$-$M_4$), is used to balance power, noise efficiency, and gain. Rejection-ratio enhancement is achieved using a frequency-controlled differential regulator (FCDR) that isolates the local rails ($V_{dd\_lna}$, $V_{mid}$ and $V_{ss\_lna}$) from the system rails, while making them closely track the $V_{cm}$ variations. The FCDR uses a high-gain amplifier $G_{amp}$ that superimposes the common-mode signal $V_{cm}$, derived from an additional electrode, on the floating supply $V_{reg}$ through a unity-gain feedback network. $V_{reg}$ is derived from a supply- and ground-independent frequency reference $F_{osc}$. Thus, DR<sup>3</sup>E can improve both CMRR and PSRR. Furthermore, the frequency reference from which $V_{reg}$ is derived can be used as a system clock, avoiding the use of a bulky crystal.

Figure 29.7.3 shows the implementation of the frequency-controlled differential regulator (FCDR) that generates the local rail of the neural amplifier stack, $V_{reg}$, which is also the control voltage of the ring VCO. Through the negative-feedback loop formed by the differential PD and the differential integrator, the VCO is locked to a frequency proportional to $1/R_{ref}C_{ref}$, making $V_{reg}$ independent of supply and ground variations and enhancing the overall PSRR [5]. The integrator also delivers power to the neural amplifier stack via transistors $M_3$-$M_4$. The common-mode level of the integrator is forced to track the $V_{cm}$ variations through the unity-gain feedback set by the OTA and the integrator. As a result, the local rails of the amplifier stack, $V_{dd\_lna}$, $V_{mid}$ and $V_{ss\_lna}$, also track $V_{cm}$, effectively superimposing $V_{cm}$ on $V_{reg}$ and enhancing the CMRR. However, the impedance $Z_{cm}$ needs to be maximized to ensure that $V_{cm}$ is detected accurately regardless of the electrode impedance $Z_e$. To achieve this, a high-value pseudo-resistor $R_{pseudo}$ is used to set the gate bias of the OTA. The differential regulator also drives a replica VCO that can be injection-locked to a low-phase-noise clock derived from the wireless power supply. This high-quality clock output can be used for communication purposes at the system level.

Implemented in a 0.18μm standard CMOS process, the neural amplifier system occupies an active area of 2.3mm<sup>2</sup>, with individual channels occupying 0.075mm<sup>2</sup>, as shown in Fig. 29.7.7. From a 1.5V supply, the system consumes 51.7μW while a single channel consumes 0.69μW. As shown in Fig. 29.7.4(a), the DR<sup>3</sup>E technique helps to achieve a CMRR greater than 110dB over a frequency range of 10Hz to 10kHz. The PSRR is >100dB over a frequency range of 10Hz to 4kHz and reaches 94dB at 10kHz. At 1kHz, the CMRR is increased from 67.8dB to 121dB, which is a 53dB improvement. As shown in Fig. 29.7.4(b), at 1kHz, the worst-case CMRR across all channels is 110dB, whereas the worst-case TCMRR at an electrode impedance of 100kΩ is 80dB. Starting from 34dB, this amounts to an improvement of 46dB. Figure 29.7.5(a) shows that the neural amplifier flat-band gain is 39.8dB over a bandwidth of 10Hz to 10kHz. As shown in Fig. 29.7.5(b), the input-referred noise PSD at 1kHz is 29nV/√Hz, while the integrated noise over a bandwidth of 10Hz to 10kHz is 3μV<sub>rms</sub>. Figure 29.7.5(c) shows the effectiveness of the differential regulator in preserving $V_{reg}$, and hence the frequency, in the presence of supply variation. Over a supply range of 1.5 to 3.5V, $V_{reg}$ varies by only 5.7mV, which corresponds to 2580ppm/V. Over the same supply range, the frequency varies between 1.12 and 1.28MHz, equivalent to ±0.19%/V. Figure 29.7.5(d) shows the phase-noise measurements of the ring VCO and the injection-locked replica VCO. Injection locking improves the close-in phase noise of the replica VCO to lower than -100dBc/Hz. At an offset of 100Hz, the improvement is more than 69dB. *In vivo* recordings from the subthalamic nucleus of an anesthetized Sprague Dawley rat, acquired using the DR<sup>3</sup>E neural amplifier, are shown in Fig. 29.7.6. From the performance benchmark shown in Fig. 29.7.6, the CMRR of the DR<sup>3</sup>E neural amplifier is 30dB better, while the TCMRR is 20dB better, than the prior works. Similarly, the PSRR at 1kHz is 20dB better than those of the prior works. Other key performance figures such as NEF/PEF and input-referred noise are better than or comparable to those of the prior works.
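As a quick arithmetic check of the measured improvements quoted above, the short sketch below recomputes the dB differences and the $V_{reg}$ supply sensitivity in mV/V. All input numbers are taken from the text; the variable names and the `db_to_ratio` helper are illustrative only.

```python
def db_to_ratio(db: float) -> float:
    """Convert a rejection ratio in dB to a linear voltage ratio."""
    return 10 ** (db / 20)

# Values quoted in the measurement paragraph.
cmrr_before_db, cmrr_after_db = 67.8, 121.0      # at 1kHz
tcmrr_before_db, tcmrr_after_db = 34.0, 80.0     # at 1kHz, 100 kOhm electrode
delta_vreg_mv = 5.7                              # V_reg change over the supply sweep
supply_lo_v, supply_hi_v = 1.5, 3.5

cmrr_gain_db = cmrr_after_db - cmrr_before_db
tcmrr_gain_db = tcmrr_after_db - tcmrr_before_db
vreg_mv_per_v = delta_vreg_mv / (supply_hi_v - supply_lo_v)

print(f"CMRR improvement : {cmrr_gain_db:.1f} dB "
      f"(x{db_to_ratio(cmrr_gain_db):.0f} less common-mode leakage)")
print(f"TCMRR improvement: {tcmrr_gain_db:.1f} dB")
print(f"V_reg sensitivity: {vreg_mv_per_v:.2f} mV/V")
```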

This work presents a neural-recording system that employs a differentially regulated rejection-ratio enhancement scheme, improving the CMRR/TCMRR/PSRR performance by several tens of dB over prior works without the need for any bulky decoupling capacitors. Furthermore, the internal clock sources of the system can provide a high-quality clock for the system, obviating the need for bulky crystal oscillators. Thus, the DR<sup>3</sup>E neural-recording system paves the way for further miniaturization. Finally, though the system is presented in the context of neural recording, it is also suitable for other biomedical signal-acquisition applications where high CMRR/PSRR performance is essential.

### Acknowledgements:

This work was supported by the Basic Science Research Program (2017R1C1B2010672) and the Convergence Technology Development Program for Bionic Arm (2017M3C1B2048608) through the National Research Foundation of Korea as well as the DGIST R&D Program (17-BD-0404), all of them funded by the Korea government (MSIT).

### References:

- [1] K. A. Ng, et al., "A Low-Power, High CMRR Neural Amplifier System Employing CMOS Inverter-Based OTAs with CMFB Through Supply Rails," *IEEE JSSC*, vol. 51, no. 3, pp. 724-737, Mar. 2016.
- [2] R. Muller, et al., "A 0.013 mm<sup>2</sup> 5μW DC-Coupled Neural Signal Acquisition IC with 0.5V Supply," *ISSCC Dig. Tech. Papers*, pp. 302-303, Feb. 2011.
- [3] D. Han, et al., "A 0.45V 100-Channel Neural-Recording IC with Sub-μW/Channel Consumption in 0.18μm CMOS," *ISSCC Dig. Tech. Papers*, pp. 290-291, Feb. 2013.
- [4] C. M. Lopez, et al., "An Implantable 455-Active-Electrode 52-Channel CMOS Neural Probe," *IEEE JSSC*, vol. 49, no. 1, pp. 248-261, Jan. 2014.
- [5] J. Lee, et al., "A 4.7MHz 53μW Fully Differential CMOS Reference Clock Oscillator with -22dB Worst-Case PSNR for Miniaturized SoCs," *ISSCC Dig. Tech. Papers*, pp. 106-107, Feb. 2015.



Figure 29.7.1: Conceptual overview of the differentially regulated rejection-ratio enhancement scheme.



Figure 29.7.2: Architecture and LNA schematic of the 16-channel neural-recording amplifier system using a DR<sup>3</sup>E scheme.



Figure 29.7.3: Schematic and timing waveforms of the frequency-controlled differential regulator.



Figure 29.7.4: Measured CMRR, PSRR and TCMRR across all the 16 channels.



Figure 29.7.5: Measured gain-bandwidth and input-referred noise of the LNA (top), supply variation of the FCDR output voltage  $V_{REG}$  and the ring-VCO frequency  $F_{OSC}$  (bottom-left), and phase-noise performance of the ring-VCO and injection-locked replica VCO (bottom-right).

| Parameter                                       | ISSCC11 [2] | ISSCC13 [3]      | JSSC14 [4]   | JSSC16 [1]                          | This work                             |
|-------------------------------------------------|-------------|------------------|--------------|-------------------------------------|---------------------------------------|
| Technology [nm CMOS]                            | 65          | 180              | 180          | 65                                  | 180                                   |
| Supply Voltage [V]                              | 0.5         | 0.45             | 1.8          | 1.0                                 | 1.5                                   |
| Power per Channel [μW]                          | 5.04        | 0.73             | 7.02         | 3.28                                | 3.23<sup>1</sup><br>(0.69)<sup>2</sup> |
| Mid-band Gain [dB]                              | N.A.        | 52               | 30-72        | 52.1                                | 39.8                                  |
| Operating Bandwidth [Hz]                        | 300-10k     | 1-10k            | 300-6k       | 1-8.2k                              | 10-10k                                |
| Channel Area [mm <sup>2</sup> ]                 | 0.013       | N.A.             | 0.088        | 0.042                               | 0.075                                 |
| Input Referred Noise [ $\mu$ V <sub>rms</sub> ] | 4.9         | 3.2              | 3.2          | 4.13                                | 3                                     |
| %THD (@ Input Amplitude)                        | N.A.        | 0.53<br>(0.5mVp) | 1<br>(18mVp) | 1<br>(0.7mVp)                       | 0.37<br>(2mVp)<br>(1.31/0.01)<sup>2</sup> |
| NEF/PEF                                         | 5.99/17.96  | 1.57/1.12        | 3.08/17.13   | 3.19/10.2                           | 1.69/4.28<sup>1</sup><br>(1.31/0.01)<sup>2</sup> |
| CMRR [dB]                                       | 75          | 73<br>@ 1kHz     | 60           | > 110<br>@ 1kHz                     | > 110<br>@ 1kHz                       |
| TCMRR [dB]                                      | N.A.        | N.A.             | N.A.         | > 59.6<br>@ 1kHz, $Z_s = 100\Omega$ | > 80<br>@ 1kHz, $Z_s = 100\Omega$     |
| PSRR [dB]                                       | 64          | 80<br>@ 1kHz     | 76           | 78<br>@ 1kHz                        | 101<br>@ 1kHz                         |

Figure 29.7.6: *In vivo* measurements on the subthalamic nucleus of a Sprague Dawley rat (top), performance summary and comparison (bottom).



Figure 29.7.7: Die micrograph of the DR<sup>3</sup>E multi-channel neural-recording amplifier system.

## 29.8 A 43.4 $\mu$ W Photoplethysmogram-Based Heart-Rate Sensor Using Heart-Beat-Locked Loop

Do-Hun Jang, SeongHwan Cho

KAIST, Daejeon, Korea

Photoplethysmogram (PPG) sensors have gained great popularity in recent years as they can easily obtain heart rate (HR) in wearable devices such as smart watches and smart rings. However, one of the biggest problems for PPG sensors is their large power consumption, as wearable devices are highly limited in their battery capacity. The power consumption of a PPG sensor is typically dominated by the LED driver, which requires several to a few tens of mA of current. Thus, many previous works aim at reducing the LED driver power [1-5]. The most widely used method is duty-cycling the LED by using a train of discrete pulses instead of keeping the LED always on [1-4]. As a PPG signal has low bandwidth, the duty-cycle ratio of the LED can be as low as 1%. Another low-power method is compressive sampling, which exploits the sparse nature of PPG signals [5]. Although it can reduce the effective duty-cycle ratio down to 0.0125%, a critical problem is that reconstructing the compressively sampled signal requires a large power consumption. In this work, we present an ultra-low-power PPG sensor with a heartbeat-locked loop (HBLL) that turns on the LED only during the PPG peaks and thus achieves an effective duty cycle of 0.0175%. We also reduce the power consumption of the analog front-end (AFE) by using the HBLL, in contrast to previous works where the AFE power is not duty-cycled. A prototype implemented in 0.18 $\mu$ m CMOS demonstrates 43.4 $\mu$ W of total power consumption with less than 2.1bpm error in heart rate.

The block diagram of the heart rate (HR) read-out IC (ROIC) is shown in Fig. 29.8.1. It consists of an AFE, a PPG-to-clock converter, and a digital HBLL. The current from the photodiode is first filtered and amplified by the AFE. Next, the PPG signal is converted to a digital clock signal that is synchronized to the heartbeat. The digital HBLL calculates heart rate and generates a narrow window which is locked to the heart rate. When the window is low, the LED is turned off and when the window is high, the LED is turned on using a train of discrete pulses for further duty-cycling. Compared to an always-on LED, the power consumption is reduced by a factor of  $D \times W/T$ , where  $W$  is the window on-time,  $T$  is the heartbeat interval, and  $D$  is the duty-cycle of the pulse train. Note that power-hungry circuits in the AFE are also powered down by the window, which further reduces the power consumption of the ROIC.
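The $D \times W/T$ factor can be evaluated directly. The sketch below is only an illustration: it uses the 100ms window stated elsewhere in the paper and a 60bpm heart rate (the lock condition quoted in the comparison table), and the function name is hypothetical.

```python
def effective_duty_cycle(pulse_duty: float, window_s: float, heartbeat_interval_s: float) -> float:
    """Effective LED duty cycle D * W / T relative to an always-on LED."""
    return pulse_duty * window_s / heartbeat_interval_s

heart_rate_bpm = 60.0
heartbeat_interval_s = 60.0 / heart_rate_bpm   # T = 1 s at 60 bpm
window_s = 0.100                               # W = 100 ms window

for pulse_duty in (0.001, 0.0025, 0.10):       # D = 0.1%, 0.25%, 10%
    eff = effective_duty_cycle(pulse_duty, window_s, heartbeat_interval_s)
    print(f"D = {pulse_duty * 100:.2f}% -> effective duty cycle = {eff * 100:.4f}%")
```

With $W/T = 0.1$, a pulse-train duty cycle between 0.1% and 10% maps to an effective duty cycle between 0.01% and 1%, consistent with the range listed for this work in the comparison table.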

The schematic of the AFE and LED driver is shown in Fig. 29.8.2. The AFE consists of a DC current-cancellation circuit, a transimpedance amplifier (TIA), a switched-capacitor low-pass filter (SC-LPF), two sample-and-holds (S/H), and a comparator. The DC current-cancellation circuit implemented by using a cascode current mirror helps to avoid saturation in the subsequent amplifier by removing large DC offset. The photodiode current is amplified by the TIA, whose gain is programmable from 40k to 1M $\Omega$ , and then filtered by the SC-LPF. The output is sampled by the S/H and fed to a voltage comparator that compares two consecutive samples. The comparator output is high when the PPG signal increases and low when it decreases. Thus, a high-to-low transition is created when there is a peak in the PPG signal, which results in a digital clock signal that is synchronized with the PPG signal and the heart rate. To reduce the AFE power consumption, TIA and DC current cancellation are also turned off by the window. The comparator is also windowed and implemented in a dynamic topology to remove static power consumption. The LED current is controlled by comparing the PPG signal to reference levels to maintain the PPG signal within the desired range [5].
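The PPG-to-clock conversion described above (comparing two consecutive samples and treating a high-to-low transition as a peak) can be sketched in software as follows. This is a behavioural illustration only, not the circuit: the function names and the synthetic 72bpm sine-wave test signal are assumptions.

```python
import math

def ppg_to_clock(samples):
    """Comparator output for each pair of consecutive samples: 1 while rising, else 0."""
    return [1 if cur > prev else 0 for prev, cur in zip(samples, samples[1:])]

def peak_indices(samples):
    """Sample indices at which the clock falls from 1 to 0, i.e. local maxima."""
    clk = ppg_to_clock(samples)
    return [i for i in range(1, len(clk)) if clk[i - 1] == 1 and clk[i] == 0]

# Hypothetical test signal: a 1.2 Hz (72 bpm) sine sampled at 40 Hz for 3 s.
fs_hz = 40
signal = [math.sin(2 * math.pi * 1.2 * n / fs_hz) for n in range(3 * fs_hz)]

peaks = peak_indices(signal)
intervals_s = [(b - a) / fs_hz for a, b in zip(peaks, peaks[1:])]
print("peak sample indices:", peaks)
print("peak-to-peak intervals (s):", intervals_s)
```

Because a peak can only be located to the nearest sample, the measured interval is quantized to multiples of the sampling period, which is the time-resolution effect the paper later cites as the source of heart-rate error.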

The block diagram and operating principle of the digital HBLL are shown in Fig. 29.8.3. The HBLL receives the clock converted from the PPG signal ($PPG_{CLK}$) from the comparator and measures the peak-to-peak interval using a counter. It then applies a variable moving-average filter and estimates the heartbeat interval (HBI[7:0]). Based on the HBI, the window generator centers the window at the estimated peak location, which is the previous peak location plus the estimated HBI. The size of the window is set to 100ms based on the statistical properties of heart rate: the difference between two consecutive heartbeat intervals is less than $\pm 50$ms (pNN50) for more than 92.1% of the time [6]. If the peak does not occur within two consecutive windows, the HBLL increases the window size from 100 to 400ms. If the peak is still not within the enlarged window, the HBLL enters the initial HBI calculation mode, during which the LED is duty-cycled with the pulse train only, without the window. The initial HBI algorithm searches for two consecutive peaks of the PPG signal and estimates the HBI. It also ensures that the dicrotic notch is not mistaken for a PPG peak by using a refractory period during which the LED is turned off. When PPG peaks appear again, the HBLL regains lock in only two heartbeats.
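The window-tracking behaviour described above can be summarized as a small state machine. The following is a hedged behavioural sketch rather than the chip's logic: the class and method names are hypothetical, the two-tap average stands in for the paper's variable moving-average filter, and coasting on the previous estimate after a miss is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

NARROW_S, WIDE_S = 0.100, 0.400   # window sizes stated in the text

@dataclass
class Hbll:
    hbi_s: float                  # current heartbeat-interval estimate
    last_peak_s: float            # time of the last (actual or predicted) peak
    window_s: float = NARROW_S
    misses: int = 0
    mode: str = "locked"          # "locked" or "initial"

    def window(self) -> Tuple[float, float]:
        """Window centred at previous peak + estimated HBI."""
        center = self.last_peak_s + self.hbi_s
        return center - self.window_s / 2, center + self.window_s / 2

    def step(self, peak_time_s: Optional[float]) -> str:
        """Process one expected beat; pass None if no peak was observed."""
        lo, hi = self.window()
        if peak_time_s is not None and lo <= peak_time_s <= hi:
            # Assumed stand-in for the variable moving-average filter.
            self.hbi_s = 0.5 * self.hbi_s + 0.5 * (peak_time_s - self.last_peak_s)
            self.last_peak_s = peak_time_s
            self.window_s, self.misses = NARROW_S, 0
            return "hit"
        self.last_peak_s += self.hbi_s        # assumption: coast on the estimate
        if self.window_s == WIDE_S:           # missed even with the enlarged window
            self.mode = "initial"
            return "relock"
        self.misses += 1
        if self.misses >= 2:                  # two consecutive misses: enlarge window
            self.window_s = WIDE_S
        return "miss"

    def reacquire(self, first_peak_s: float, second_peak_s: float) -> None:
        """Initial HBI calculation: estimate the HBI from two consecutive peaks."""
        self.hbi_s = second_peak_s - first_peak_s
        self.last_peak_s = second_peak_s
        self.window_s, self.misses, self.mode = NARROW_S, 0, "locked"

loop = Hbll(hbi_s=1.0, last_peak_s=0.0)
print([loop.step(t) for t in (1.02, None, None, None)], loop.mode)
```

In this sketch, losing the signal costs three beats before the initial-HBI mode is entered (two misses with the narrow window, one with the enlarged window).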

A prototype chip is implemented in a 0.18 $\mu$ m CMOS process. To verify the functionality of the HBLL, heart rate was measured on a healthy male subject using three devices simultaneously: the HBLL-PPG sensor, a PPG sensor without HBLL, and a commercial ECG sensor that serves as the golden reference. The measured results are shown in Fig. 29.8.4, where it can be seen that the HBLL-PPG is locked to the peaks of the PPG signal and to the time-delayed R-peaks of the ECG, where the time delay is due to the pre-ejection period and pulse transit time. To verify the fast re-locking property, an abrupt change in heart rate from 72 to 54bpm was applied using PPG source equipment. This change leads to two consecutive peak errors, which causes the window size to increase to 400ms. In this experiment, the peaks are still missing after the increase in window size, and hence the HBLL enters the initial HBI calculation mode and regains lock. The HBLL loses a total of three HBIs during this abrupt change, which can be considered trivial in practice.

The accuracy and power consumption of the HBLL sensor are shown in Fig. 29.8.5. The measurement was performed using PPG source equipment. The worst-case errors are 5bpm (3.2%) and 2.1bpm (1.3%) for sampling frequencies of 40Hz and 100Hz, respectively. The error is due to the time resolution at each sampling rate. When the duty-cycle ratio of the pulse train is 0.25% (i.e., $D = 0.25\%$), the power consumption of the LED driver is reduced from 105 $\mu$ W, when only pulse-train duty-cycling is used, to 16 $\mu$ W, when the HBLL is used as well. The AFE power is also reduced from 192 to 25 $\mu$ W because the AFE is turned on and off by the window. The power consumption of the digital HBLL is measured as 2.4 $\mu$ W. The performance of the HR ROIC is compared with other works in Fig. 29.8.6. It can be seen that the power consumption of the proposed work is the lowest.
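The power figures quoted above add up to the total stated in the paper title. The short calculation below uses only numbers from the text; the variable names are illustrative.

```python
# Measured power at D = 0.25%, in microwatts (values from the text).
led_pulse_only_uw, led_with_hbll_uw = 105.0, 16.0
afe_always_on_uw, afe_with_hbll_uw = 192.0, 25.0
hbll_digital_uw = 2.4

total_with_hbll_uw = led_with_hbll_uw + afe_with_hbll_uw + hbll_digital_uw
led_reduction = led_pulse_only_uw / led_with_hbll_uw
afe_reduction = afe_always_on_uw / afe_with_hbll_uw

print(f"Total with HBLL : {total_with_hbll_uw:.1f} uW")   # 43.4 uW
print(f"LED reduction   : {led_reduction:.2f}x")
print(f"AFE reduction   : {afe_reduction:.2f}x")
```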

### Acknowledgments:

This research was supported by the Nano-Material Technology Development Program through the NRF of Korea funded by the Ministry of Science, ICT and Future Planning (NRF-2016M3A7B4910637).

### References:

- [1] M. Tavakoli, et al., "An Ultra-Low-Power Pulse Oximeter Implemented with an Energy-Efficient Transimpedance Amplifier," *IEEE TBioCAS*, vol. 4, no. 1, pp. 27-38, Feb. 2010.
- [2] K. N. Glaros, et al., "A sub-mW Fully-Integrated Pulse Oximeter Front-End," *IEEE TBioCAS*, vol. 7, no. 3, pp. 363-375, June 2013.
- [3] A. Wong, et al., "A Low-Power CMOS Front-End for Photoplethysmographic Signal Acquisition with Robust DC Photocurrent Rejection," *IEEE TBioCAS*, vol. 2, no. 4, pp. 280-288, Dec. 2008.
- [4] E. S. Winokur, et al., "A Low-Power Dual-Wavelength Photoplethysmogram (PPG) SoC with Time-Varying Interferer Removal," *IEEE TBioCAS*, vol. 9, no. 4, pp. 581-589, Aug. 2015.
- [5] P. V. Rajesh, et al., "A 172 $\mu$ W Compressive Sampling Photoplethysmographic Readout with Embedded Direct Heart-Rate and Variability Extraction from Compressively Sampled Data," *ISSCC Dig. Tech. Papers*, pp. 386-387, Feb. 2016.
- [6] J. Mietus, et al., "The pNNx Files: Re-Examining a Widely Used Heart Rate Variability Measure," *Heart*, vol. 88, no. 4, pp. 378-380, 2002.



Figure 29.8.1: Block diagram of the HR ROIC.



Figure 29.8.3: Block diagram of the Heartbeat-Locked Loop and its operation principles.



Figure 29.8.5: Measured heart rate and power consumption of the AFE and LED driver with and without HBLL.



Figure 29.8.2: Schematic of the AFE and LED driver, and timing diagram.



Figure 29.8.4: Measured HBLL-PPG, PPG, and ECG signals and re-locking test.

|                               | [1]                          | [2]                          | [3]                          | [4]                          | [5]                             | This work                                                           |
|-------------------------------|------------------------------|------------------------------|------------------------------|------------------------------|---------------------------------|---------------------------------------------------------------------|
| Method                        | LED duty-cycled using pulses | LED duty-cycled using pulses | LED duty-cycled using pulses | LED duty-cycled using pulses | compressive sampling (LED only) | LED duty-cycled using pulses & window, AFE duty-cycled using window |
| Technology                    | 1.5µm                        | 0.35µm                       | 0.35µm                       | 0.18µm                       | 0.18µm                          | 0.18µm                                                              |
| Supply Voltage                | 5V                           | 3.3V                         | 2.5V                         | 1.8V                         | 1.2V                            | 3.3V                                                                |
| Sampling Frequency            | 100Hz                        | 100Hz                        | 100Hz                        | 165Hz                        | 4Hz                             | 40Hz                                                                |
| Effective LED Duty Cycle      | 3%                           | 3%                           | 0.2%                         | 0.7%                         | 0.0125% <sup>3)</sup>           | 0.01-1% <sup>1)</sup>                                               |
| Power Consumption (ROIC)      | 400µW                        | 528µW                        | 600µW                        | 216µW                        | 172µW <sup>4)</sup>             | 29.1µW <sup>1), 2)</sup>                                            |
| Power Consumption (LED)       | 4400µW                       | 309-1360µW                   | N/A                          | 120-1125µW                   | 43-1200µW                       | 9-480µW <sup>1), 5)</sup>                                           |
| Heart Rate Error              | N/A                          | N/A                          | N/A                          | N/A                          | 10bpm                           | 5bpm                                                                |
| Integrated Feature Extraction | Yes (SpO2)                   | No                           | No                           | No                           | Yes (HR/HRV)                    | Yes (HR)                                                            |

1) Measured when locked at 60bpm using HBLL, duty cycle = 0.1-10% (40Hz), 0.25-25% (100Hz).  
2) Including AFE and HBLL power consumption.  
3) Calculated from the given data.  
4) ROIC power consumption without reconstruction.  
5) Using DCM03 (LED+photodiode), LED current: 10mA, Photodiode size: 2.5mm × 2.5mm.

Figure 29.8.6: Performance summary and comparison with other state-of-the-art PPG sensors.



Figure 29.8.7: Die micrograph.

# Session 30 Overview: *Emerging Memories*

## MEMORY AND TECHNOLOGY DIRECTIONS SUBCOMMITTEES



**Session Chair:**  
***Shinichiro Shiratake***  
*Toshiba Memory, Yokohama, Japan*



**Associate Chair:**  
***Edoardo Charbon***  
*EPFL, Neuchatel, Switzerland*

**Subcommittee Chair: *Leland Chang*, IBM, Yorktown Heights, NY**

**Subcommittee Chair: *Makoto Nagata*, Kobe University, Kobe, Japan**

Speed and power are major concerns in today's memory designs, and they are becoming particularly important with the emergence of high-performance computing and deep-learning applications.

In this session, multiple new developments in emerging memories will be presented. An 11Mb 40nm CMOS embedded RRAM macro with fast access and improved reliability by means of circuit techniques is proposed. A 28nm CMOS 1Mb MRAM with 2.8ns access time is presented, where circuit techniques are used to save write energy. An embedded 28nm CMOS 32kb 2T2MTJ MRAM is demonstrated achieving a 1.3ns access time through a novel sensing scheme with a sophisticated offset-canceling technique. Finally, a novel memory based on crystalline oxide semiconductor FETs, optimized for an embedded deep-learning engine, achieves sub-50ns read/write capability.



1:30 PM

**30.1 An N40 256Kx44 Embedded RRAM Macro with SL-Precharge SA and Low-Voltage Current Limiter to Improve Read and Write Performance**
*C-C. Chou*, TSMC, Hsinchu, Taiwan

In Paper 30.1, TSMC presents an 11Mb embedded RRAM macro fabricated in a 40nm CMOS process. It proposes a new sense amplifier with a 58% faster access speed, and a new write scheme to improve both endurance and retention performance.



2:00 PM

**30.2 A 1Mb 28nm STT-MRAM with 2.8ns Read Access Time at 1.2V VDD Using Single-Cap Offset-Cancelled Sense Amplifier and In-situ Self-Write-Termination**
*Q. Dong*, University of Michigan, Ann Arbor, MI and TSMC, San Jose, CA

In Paper 30.2, the University of Michigan presents a 1Mb STT-MRAM with a 2.8ns read-access time, offering an offset-canceling sense amplifier for fast access and a self-write-termination scheme to save write energy.



2:30 PM

**30.3 A 28nm 32Kb Embedded 2T2MTJ STT-MRAM Macro with 1.3ns Read-Access Time for Fast and Reliable Read Applications**
*T-H. Yang*, National Tsing Hua University, Hsinchu, Taiwan and TSMC, Hsinchu, Taiwan

In Paper 30.3, National Tsing Hua University presents a 32kb embedded 2T2MTJ MRAM macro with a 1.3ns read-access time. It adopts a continuous-recording-and-enhancement voltage sense amplifier to realize the world's fastest read operation.



2:45 PM

**30.4 A 20ns-Write 45ns-Read and 10<sup>14</sup>-Cycle Endurance Memory Module Composed of 60nm Crystalline Oxide Semiconductor Transistors**
*S. Maeda*, Semiconductor Energy Laboratory, Atsugi, Japan

In Paper 30.4, the Semiconductor Energy Laboratory proposes a memory composed of crystalline oxide semiconductor FETs (OSFETs) compatible with CMOS processes. The memory, embedded in a deep-learning engine, has a 1kb density enabled by a 60nm OSFET process, and can be written in 20ns and read in 45ns using 97.9pJ and 123pJ, respectively.

### 30.1 An N40 256Kx44 Embedded RRAM Macro with SL-Precharge SA and Low-Voltage Current Limiter to Improve Read and Write Performance

Chung-Cheng Chou, Zheng-Jun Lin, Pei-Ling Tseng, Chih-Feng Li, Chih-Yang Chang, Wei-Chi Chen, Yu-Der Chih, Tsung-Yung Jonathan Chang

TSMC, Hsinchu, Taiwan

RRAM is an attractive and low-cost memory structure for embedded applications due to the simplicity of the RRAM element (RE) and its compatibility with a logic process. An RRAM bit cell (Fig. 30.1.1) consists of an NMOS select transistor and a bipolar RE, which in turn consists of a bottom electrode (BE), a transition metal-oxide film (Hi-K), a metal capping layer, and a top electrode (TE). The memory cell operates as a 3-terminal device with a bit-line (BL), a source-line (SL), and a word-line (WL): BL is connected to TE, SL is connected to the source node of the select transistor, and WL is connected to the gate of the select transistor. A common SL (CSL) architecture is adopted in this work. CSL allows two or more columns to share one source line, so the number of SL column muxes can be reduced and macro area saved. In addition, because the SL count is reduced, SL can be implemented with a wider metal track, which also lowers the SL resistance. However, a CSL architecture results in a larger parasitic capacitance on SL. This paper presents an SL-precharge scheme to deal with this increased capacitance when reading from the CSL.

A fresh RRAM cell is equivalent to a dielectric with extremely high resistance. It requires a one-time process, called forming, to become filamentary. During forming, a voltage of 2-3V is applied across the RE from the BL node to ionize the oxygen atoms and sweep them towards the capping layer; this leaves oxygen vacancies that form a low-ohmic filament. A write operation can be a SET (write 1 into a bit-cell) or a RESET (write 0 into a bit-cell) operation. SET is similar to the forming process, but with a more moderate voltage bias. RESET is the reverse of a SET operation: a write bias is applied to the SL node. Oxygen atoms located near the TE migrate back and re-occupy the vacancies, resetting the cell to the high-resistance state (HRS).

SET changes an RE from a high-resistance state (HRS) into a low-resistance state (LRS). The minimum SET pulse width is determined by the slowest REs, but faster REs may be overstressed as a result. Techniques have been proposed to stop the write operation before REs are overstressed [1,2] by monitoring the cell current and terminating the write when the current reaches a target level. However, such a sudden stop may not form a dense enough filament, which can degrade data retention. To prevent RE overstress and to generate a robust filament structure, a low-voltage write-current-limiting scheme (LV-WCLS) is presented.

The proposed scheme is depicted in Fig. 30.1.2(a). $V_{BL}$ is the write bias applied to the selected 1T1R cell. Transistors N3 and N4 serve as current limiters, and N5 and N6 are the switches that enable the N3 and N4 branches. The gate bias for N3 and N4 is generated by the closed loop formed around OP1, which makes N1 sustain the compliance current level $I_{COMP}$ and keeps the drain node of N1 at 0.1V. The combination of N3 and N4 is designed to be equivalent to N1, i.e. the combined channel width of N3 and N4 equals that of N1. At the beginning of SET, the cell current is low (HRS, $< I_{COMP}$), so $V_D$, the drain voltage of N3 and N4, is less than 0.1V. The output of OP2, SW2, stays high and keeps N5 and N6 on, so that the cell current flows through both N3 and N4. During SET the cell current and $V_D$ increase. When the current reaches $I_{COMP}$, $V_D$ reaches 0.1V, and the output of OP2 goes low and cuts off the current through N4. All of the cell current then flows through N3, which elevates $V_D$. The elevated $V_D$ reduces the effective bias across the 1T1R cell and therefore weakens the electric field that drives oxygen atoms toward the capping layer. An $I_b$ vs. $V_D$ curve describing the operation is shown in Fig. 30.1.2(b). N3 keeps conducting the cell current for an intentional delay time $T_d$ (e.g. 500ns) after the N4 branch is shut down. This lets the filament keep growing moderately during $T_d$, which improves data retention. In a conventional design [5], the current limiter can be implemented with a diode-connected device, but that can incur a ~0.5V voltage overhead at the footer of the write path, whereas the presented LV-WCLS requires only a 0.1V overhead.
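The two-phase current limiting described above can be sketched behaviorally. This is a minimal illustration rather than a model of the circuit: the function name `lv_wcls_timeline`, the sampled current trace, the 100µA compliance level and the 100ns time step are assumptions chosen for the example. Only the 0.1V and ~0.5V overheads and the 500ns example value of $T_d$ come from the text.

```python
def lv_wcls_timeline(cell_current_ua, i_comp_ua, t_d_ns=500.0, dt_ns=100.0):
    """Return (t_n4_off_ns, t_write_end_ns) for a sampled SET current trace.

    Phase 1: N3 and N4 both conduct while the cell current is below I_COMP.
    Phase 2: once the current reaches I_COMP, N4 is cut off and N3 alone
             keeps conducting for the intentional delay T_d.
    Returns (None, None) if the cell never reaches I_COMP in the trace.
    """
    for step, current in enumerate(cell_current_ua):
        if current >= i_comp_ua:
            t_n4_off = step * dt_ns
            return t_n4_off, t_n4_off + t_d_ns
    return None, None


# Hypothetical SET trace (uA): the filament forms and the current ramps up.
trace = [5.0, 12.0, 30.0, 70.0, 110.0, 130.0]
t_off, t_end = lv_wcls_timeline(trace, i_comp_ua=100.0)
print(t_off, t_end)

# Footer voltage overhead quoted in the text: diode-connected limiter vs. LV-WCLS.
overhead_saving_v = 0.5 - 0.1
print(round(overhead_saving_v, 1))
```

The sketch keeps the two events separate (N4 cut-off, then end of write after $T_d$) because the text attributes over-SET protection to the first and filament robustness to the second.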

Figure 30.1.2(c) shows the cell current distribution after SET ($I_b$ distribution) with LV-WCLS. Shutting down the N4 branch prevents the RRAM cell from being over-SET, so the high-current tail of the distribution is dramatically reduced. The intentional delay improves filament robustness, so the lower bound is pushed in. As a result, LV-WCLS improves both endurance and retention.

The presented SA scheme is depicted in Fig. 30.1.3; only two columns are shown. CSL is the common source-line of the two columns, and BL0 and BL1 are the bitlines of the two associated columns. The sensing scheme includes the core read SA, a bias generator (Bias\_gen), and BL/SL pre-chargers and equalizers. The read SA comprises a reference branch and a cell current branch. The reference current $I_{ref}$ is generated by $V_{CL}$ and $V_{RD}$ biasing. RD1 and RDREF are the voltage signals developed on the two current branches. The voltage comparator (latch-SA) amplifies the voltage signal to $V_{DD}$ or $V_{SS}$. For example, if the cell current is larger than $I_{ref}$, RD1 will be smaller than RDREF, and the latch-SA will output a logic-1.

Equalizers are required to virtually short the two terminals of the half-selected cells. This effectively increases the capacitive load of the read port, CSL. In addition, both $V_{CL}$ and $V_{RD}$ are pulled down by the sink currents to ground. The disturbed $V_{CL}$ and $V_{RD}$ levels lengthen the stabilization time, and stable current flows are crucial for correct sensing. To overcome these factors, we use a pre-charge scheme. At standby, we pre-charge BL and SL to $V_{RSL}$ (0.3V), the equilibrium voltage during read, which shortens the time to establish a stable cell current. At the early stage of a read cycle, the sensing node RD1 is pre-charged to $V_{DD} - V_{lp}$, which virtually equals the reference voltage RDREF. The virtually equalized RD1 and RDREF shorten the time for voltage signal development. The waveforms for a read operation are shown in Fig. 30.1.4, together with the difference between the presented sensing scheme and a conventional SA [3,4]. With a conventional SA, RD1 is precharged to $V_{DD}$, and BL and SL are grounded at standby, resulting in a bigger signal bounce and thereby a longer voltage signal development time. Fig. 30.1.5(a) shows a $T_{acc}$ versus $V_{RSL}$ shmoo plot. $V_{RSL}$ significantly influences $T_{acc}$. For example, when $V_{RSL}$ is 0.18V, the $T_{acc}$ is 21ns for reading a stored-0 ($\Delta I_{l0} \sim 1\mu\text{A}$) and 13ns for reading a stored-1 ($\Delta I_{l1} \sim 2.4\mu\text{A}$). However, when $V_{RSL}$ is 0.26V, $T_{acc}$ reduces to 7ns for reading a stored-0 and to 9ns for a stored-1. As a result, a 9ns read-access time can be achieved with an $I_{RD}/I_{R1}$ separation of 3.4$\mu\text{A}$.
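The step from the two shmoo points to the 9ns figure can be made explicit: the macro access time has to cover the slower of the two data states. The sketch below uses only the four access times quoted in the text; the dictionary layout and the helper name `worst_case_tacc` are illustrative assumptions.

```python
# Measured access times quoted in the text, keyed by V_RSL (V).
# Each entry is (T_acc for stored-0, T_acc for stored-1) in ns.
shmoo_points = {
    0.18: (21.0, 13.0),
    0.26: (7.0, 9.0),
}


def worst_case_tacc(point):
    """The macro access time must cover the slower of the two data states."""
    return max(point)


worst = {v_rsl: worst_case_tacc(p) for v_rsl, p in shmoo_points.items()}
best_v_rsl = min(worst, key=worst.get)
print(worst, best_v_rsl)
```

Note that the limiting state changes between the two points: stored-0 is the slow case at 0.18V, while stored-1 is the slow case at 0.26V.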

An 11Mb HfO<sub>x</sub>-based RRAM macro was implemented in a 40nm TSMC technology; the die photograph is shown in Fig. 30.1.5(b). A common SL architecture is used to reduce SL resistance and to save macro area. However, this causes a larger capacitive load, and read performance suffers when reading from CSL. Our SA shortens the required cell current stabilization time and decreases the read-access time by 58% compared to a grounded BL/SL pre-charge scheme. An LV-WCLS for SET operation is shown to improve both endurance and retention performance. Compared to a conventional current limiter, this scheme reduces the write bias by ~400mV. These write features allow the RRAM macro to achieve a robust read current window of 12$\mu\text{A}$ after 1kC RAC at -40, 25 and 125°C, as shown in Fig. 30.1.6.

#### Acknowledgment:

The authors would like to convey our appreciation to the physical layout team, the process RD team, the PE team, the RA team and the TE team of TSMC for their great support.

#### References:

- [1] X. Xue, et al., "A 0.13$\mu\text{m}$ 8Mb Logic-Based Cu<sub>x</sub>Si<sub>y</sub>O ReRAM with Self-Adaptive Operation for Yield Enhancement and Power Reduction," *JSSC*, vol. 48, no. 5, pp. 1315-1322, May 2013.
- [2] Y. L. Song, et al., "Reliability Significant Improvement of Resistive Switching Memory by Dynamic Self-Adaptive Write Method", *Symp. on VLSI Tech.*, pp. 102-103, 2013.
- [3] D. Gogl et al., "A 16-Mb MRAM featuring bootstrapped write drivers," *JSSC*, vol. 40, no. 4, pp. 902-908, April 2005.
- [4] J. Kim, et al., "A Novel Sensing Circuit for Deep Submicron Spin Transfer Torque MRAM (STT-MRAM)," *IEEE TVLSI*, vol. 20, no. 1, pp. 181-186, Jan. 2012.
- [5] A. Kawahara, et al., "Filament scaling forming technique and level-verify-write scheme with endurance over 10<sup>7</sup> cycles in ReRAM," *ISSCC*, pp. 220-221, 2013.



Figure 30.1.1: Voltages applied to an RRAM cell during basic operations: vacancy forming, reset and set.



Figure 30.1.2: (a) Schematic of LV-WCLS, (b) operation IV curve, and (c) I<sub>r</sub> distribution after SET.



Figure 30.1.3: SL pre-charged SA.



Figure 30.1.5: (a) Measured shmoo plot of T<sub>acc</sub> vs. V<sub>rsl</sub>. (b) Die photograph and summary table.



Figure 30.1.4: Read operation waveforms for proposed SA and a conventional one.



Figure 30.1.6: Current window after 1kC RAC at -40, 25 and 125°C.

### 30.2 A 1Mb 28nm STT-MRAM with 2.8ns Read Access Time at 1.2V VDD Using Single-Cap Offset-Cancelled Sense Amplifier and In-situ Self-Write-Termination

Qing Dong<sup>1,2</sup>, Zhehong Wang<sup>1</sup>, Jongyup Lim<sup>1</sup>, Yiqun Zhang<sup>1</sup>,  
Yi-Chun Shih<sup>3</sup>, Yu-Der Chih<sup>3</sup>, Jonathan Chang<sup>3</sup>, David Blaauw<sup>1</sup>,  
Dennis Sylvester<sup>1</sup>

<sup>1</sup>University of Michigan, Ann Arbor, MI; <sup>2</sup>TSMC, San Jose, CA

<sup>3</sup>TSMC, Hsinchu, Taiwan

1T1R spin-transfer-torque (STT) MRAM is a promising candidate for next-generation high-density embedded non-volatile memory [1-2]. However, 1T1R STT-MRAM suffers from limited sensing margin and high write power. As shown in Fig. 30.2.1(a), sense amplifier design is challenging due to the small difference (only 2x) between the high-resistance state ($R_{AP}$) and the low-resistance state ($R_P$), as well as $R_{AP}$ degradation with increasing temperature. Moreover, the $R_P$ and $R_{AP}$ resistance distributions shift with process variation, requiring a read reference ($V_{ref}$) that tracks process. To improve the sensing margin, several offset-cancellation methods have been reported to reduce sense amplifier mismatch [3]. However, these methods use multiple capacitors and hence incur significant area overhead. To address this issue, we propose an offset-cancelled sense amplifier that uses only a single capacitor and improves the sensing margin by more than 60%. A second design challenge for STT-MRAM stems from the high current needed to flip a cell during a write operation. For non-volatile memory applications with a 10-year retention time requirement, the write current can be as high as several hundred $\mu$A. Moreover, as shown in Fig. 30.2.1(b), the required write time varies with the state change required (0→1 or 1→0), process variation, and temperature. As a result, a fixed write time that ensures a successful write under all conditions wastes significant energy under typical or average conditions. We propose an *in-situ* self-write-termination method to reduce write energy in most scenarios. The sense amplifier is reconfigured to continuously monitor the write operation and automatically shut off the write drivers when the state transition is detected, without an area or timing penalty. In addition, dual dummy columns are added in each array to provide a read $V_{ref}$ that tracks row-wise PVT variation. 
A 1Mb STT-MRAM was fabricated in 28nm technology, and achieves a 2.8ns read-access time at 25°C and 3.6ns at 120°C. With *in-situ* self-write-termination, the write power is reduced by 47% with a 20ns write-access time at 25°C and by 60% at 120°C.

Figure 30.2.1(c) shows a detailed block diagram of the 1Mb STT-MRAM, which contains 8 arrays, each with 256×514 cells. To improve read performance and sensing margins, a constant-current sensing method is used during read operations. As Fig. 30.2.2(a) shows, a constant read current $I_{read}$, which can be as high as 25% of the write current, is applied to the selected BL. A BL voltage develops quickly and is sensed by the offset-cancelled amplifier. No NMOS voltage clamp is employed in the read path; instead, a PMOS current-clamping header transistor limits the BL current to avoid read disturbance. Since these PMOS headers also function as write drivers, their sizes are less constrained, alleviating the mismatch of $I_{read}$ without any additional area overhead.

Each array has two extra columns to generate the reference voltage,  $V_{ref}$ , on ref BL. A reference cell in one column is programmed to  $R_P$  and the other to  $R_{AP}$ . When WL is asserted, the two reference columns are activated and by setting the ref BL current to  $2I_{read}$ , an intermediate reference voltage  $V_{ref} = 2I_{read}(R_P \parallel R_{AP})$  is generated and distributed to 16 two-stage sense amplifiers [4]. Row-wise PVT variations are tracked and compensated for as the reference cells are located in the same row as the selected cells.
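A short numerical check shows where this reference lands relative to the two data levels. The sketch assumes an idealized cell in which the BL voltage is simply $I_{read}$ times the MTJ resistance (access-transistor and wire resistance are ignored); the resistance and current values are hypothetical, with $R_{AP} = 2R_P$ chosen to match the roughly 2x ratio mentioned in the introduction.

```python
def parallel(r1, r2):
    """Equivalent resistance of two resistors in parallel."""
    return r1 * r2 / (r1 + r2)


def reference_voltage(i_read, r_p, r_ap):
    """V_ref = 2 * I_read * (R_P || R_AP), as given in the text."""
    return 2.0 * i_read * parallel(r_p, r_ap)


# Hypothetical values: the absolute resistance and read current are
# illustrative only; R_AP = 2 * R_P follows the ~2x ratio in the text.
I_READ = 20e-6      # A
R_P = 5e3           # ohm
R_AP = 10e3         # ohm

v_p = I_READ * R_P                          # BL voltage for a cell in R_P
v_ap = I_READ * R_AP                        # BL voltage for a cell in R_AP
v_ref = reference_voltage(I_READ, R_P, R_AP)
print(v_p, v_ref, v_ap)
```

Because $2(R_P \parallel R_{AP})$ is the harmonic mean of the two resistances, the reference sits between the two levels but below their arithmetic midpoint in this idealized model (at $\tfrac{4}{3} I_{read} R_P$ when $R_{AP} = 2R_P$).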

For the second-stage sense amplifier, a single-capacitor offset cancellation method is proposed to minimize mismatch. Figure 30.2.2(c) shows simplified configurations of the design for each timing phase. During offset cancellation, the inputs of the inverters are connected to their outputs. The capacitor  $C_0$  samples the difference between the trip voltages of the two inverters ( $V_L - V_R$ ). Then, during BL sampling, this voltage is added to  $V_{ref}$  and the input of the right inverter becomes  $V_{ref} + V_L - V_R$  while that of the left inverter remains  $V_{ref}$ . This shifts the trip point during evaluation such that the mismatch of the two inverters is greatly reduced. Based on 10k Monte-Carlo simulations (Fig. 30.2.2(b)), the standard deviation of the input offset is reduced by more than 60% compared to a conventional sense amplifier with the same transistor sizes. Both offset cancellation and precharge phases are performed simultaneously with address decoding, avoiding any timing penalty for the proposed method. With only a single capacitor added, the area of the sense amplifier is reduced by approximately 15% compared to sense amplifiers using dual capacitors for offset cancellation.

Figure 30.2.3 shows the proposed *in-situ* self-termination write method. During a write, hundreds of $\mu$A are applied to the cell, with current flowing from BL to SL when writing a 1 and from SL to BL when writing a 0. When the free magnetic layer in the STT bitcell flips, the resistance of the cell changes abruptly, which can be detected by observing the BL voltage. Since the resistance change and current direction both flip polarity between a write 1 and a write 0, the BL voltage drops in both cases when the write completes (Fig. 30.2.3(b)), simplifying write completion detection. The read sense amplifier is reconfigured as a continuous gate-connected voltage sense amplifier to detect the BL voltage drop, as shown in Fig. 30.2.3(a). Since the write driver and detection circuit are shared with the read circuit, area overhead is negligible. Offset cancellation and precharge phases are overlapped with decoding, avoiding timing penalties. Once write completion is sensed, the 'stop' signal disables the write driver on that BL (Fig. 30.2.3(c)). This method automatically terminates the write when it completes, saving write power and improving reliability. In addition, redundant writes are seamlessly detected and avoided; conventionally this requires a read-before-write operation [5], which incurs a full read-cycle overhead.
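The detection logic described above can be sketched as a simple threshold test on the BL voltage. This is a behavioral illustration only: the function name `self_terminated_write`, the threshold `v_detect` and all voltage samples are hypothetical, and the traces are assumed to start once the BL has settled after the write driver turns on.

```python
def self_terminated_write(bl_samples_v, v_detect):
    """Return the index of the first sample at which the driver is disabled.

    The driver is disabled at the first sample where the BL voltage is
    below v_detect. If that never happens, the write runs for the full
    trace and len(bl_samples_v) is returned.
    """
    for index, v_bl in enumerate(bl_samples_v):
        if v_bl < v_detect:
            return index
    return len(bl_samples_v)


# Hypothetical settled BL traces (V).
normal_write = [0.80, 0.80, 0.80, 0.55, 0.55, 0.55]     # cell flips at sample 3
redundant_write = [0.55, 0.55, 0.55, 0.55, 0.55, 0.55]  # already in target state
slow_cell = [0.80, 0.80, 0.80, 0.80, 0.80, 0.80]        # never flips in this window

for name, trace in [("normal", normal_write),
                    ("redundant", redundant_write),
                    ("slow", slow_cell)]:
    print(name, self_terminated_write(trace, v_detect=0.65))
```

The redundant-write case stops at the first sample because the BL already sits at its post-switching level, which is how the same comparison covers both write completion and redundant-write detection in this sketch.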

The proposed 1Mb STT-MRAM is fabricated in 28nm embedded MRAM technology. Figure 30.2.7 shows the die photo of the MRAM chip. The 1Mb MRAM macro occupies 0.214mm<sup>2</sup>. Figure 30.2.4(a) shows the read-failure rate vs. access time for a single 128kb array, without any error correction. The array achieves a 2.8ns read-access time with an approximate error rate of $10^{-5}$ at 25°C, and 3.6ns at 120°C. For an access time less than 2.8ns, higher error rates are observed with the self-generated $V_{ref}$ because $R_{BL}$ takes longer to stabilize due to driving 16 sense amps. For access times greater than 2.8ns at room temperature, read self-reference generation shows an error rate similar to that of an externally generated $V_{ref}$, and it exhibits a lower error rate at higher temperatures since it dynamically tracks the change in resistance with temperature. Figure 30.2.4(b) shows that the self-reference generation tracks array-to-array variation, improving the read failure rate from $2 \times 10^{-5}$ to $9 \times 10^{-6}$ compared with a fixed externally generated $V_{ref}$, as measured across 8 arrays.
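The 128kb array size is consistent with the organization given earlier (8 arrays of 256×514 cells). The check below assumes that the two extra reference columns per array account for the difference between 514 and 512 columns, and that 1kb means 1024 bits; the constant names are illustrative.

```python
ROWS = 256
COLUMNS_TOTAL = 514
REFERENCE_COLUMNS = 2      # assumption: the two extra reference columns per array
ARRAYS = 8

data_columns = COLUMNS_TOTAL - REFERENCE_COLUMNS
bits_per_array = ROWS * data_columns
total_bits = ARRAYS * bits_per_array

print(bits_per_array // 1024, "kb per array")
print(total_bits // (1024 * 1024), "Mb total")
```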

The measured Shmoo plot in Fig. 30.2.4(c) shows that the memory can be successfully read below 0.6V. Figure 30.2.4(d) shows the measured  $V_{DD\_min}$  across 10 dies for the proposed sense amplifier. Due to offset cancellation, the sense amplifier is robust at low supply voltages with an average  $V_{DD\_min}$  of 0.57V ( $\sigma=19\text{mV}$ ).

Figure 30.2.5(a) shows that the required write-access time must be greater than 20ns in order to achieve a $10^{-5}$ failure rate using a conventional, fixed write time. Since the required write time varies significantly across cells, a constant write time wastes substantial write power. The proposed self-write-termination saves 47% of write power for a $10^{-5}$ error rate at 20ns, and 61% for a $10^{-6}$ error rate at 30ns. When the temperature increases, the self-write-termination is more effective (Fig. 30.2.5(b)) because the write time decreases with temperature. Figure 30.2.6 compares this work to other MRAM work: compared with the listed references, the proposed MRAM achieves the best read-access time and power consumption within the smallest macro area.
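The source of the saving can be illustrated with a simple estimate. The sketch assumes constant write power while the driver is on, so that energy is proportional to on-time; the function name `write_energy_saving` and the per-cell switching times are hypothetical and are not measured data, so the printed percentage is not expected to match the 47% or 61% reported above.

```python
def write_energy_saving(switch_times_ns, fixed_write_ns):
    """Fraction of write energy saved by stopping each write when the cell flips.

    Assumes constant write power while the driver is on. Cells slower than
    the fixed window are charged the full window, since they would not
    complete in either scheme.
    """
    on_times = [min(t, fixed_write_ns) for t in switch_times_ns]
    mean_on_time = sum(on_times) / len(on_times)
    return 1.0 - mean_on_time / fixed_write_ns


# Hypothetical per-cell switching times (ns).
switch_times = [6.0, 8.0, 10.0, 12.0, 14.0, 25.0]
saving = write_energy_saving(switch_times, fixed_write_ns=20.0)
print(f"{saving:.0%}")
```

In this model the saving grows when the fixed window is lengthened to cover rarer slow cells, or when typical switching times shorten, which is in line with the larger savings reported at 30ns and at high temperature.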

#### Acknowledgements:

This work was supported by the TSMC university joint development program and university shuttle program.

#### References:

- [1] H.-C. Yu, et al., "Cycling Endurance Optimization Scheme for 1Mb STT-MRAM in 40nm Technology," *ISSCC*, pp. 224-225, 2013.
- [2] K. Tsuchida, et al., "A 64Mb MRAM with Clamped-Reference and Adequate-Reference Schemes," *ISSCC*, pp. 258-259, 2010.
- [3] M.-F. Chang, et al., "An Offset-Tolerant Fast-Random-Read Current-Sampling-Based Sense Amplifier for Small-Cell-Current Nonvolatile Memory," *JSSC*, vol. 48, no. 3, pp. 864-877, March 2013.
- [4] D. Halupka, et al., "Negative-Resistance Read and Write Schemes for STT-MRAM in 0.13μm CMOS," *ISSCC*, pp. 256-257, 2010.
- [5] H. Noguchi, et al., "4Mb STT-MRAM-Based Cache with Memory-Access-Aware Power Optimization and Write-Verify-Write/Read-Modify-Write Scheme," *ISSCC*, pp. 132-133, 2016.
- [6] R. Nebashi, et al., "A 90nm 12ns 32Mb 2T1MTJ MRAM," *ISSCC*, pp. 462-463, 2009.
- [7] H. Noguchi, et al., "A 3.3ns-Access-Time 71.2μW/MHz 1Mb Embedded STT-MRAM Using Physically Eliminated Read-Disturb Scheme and Normally-Off Memory Architecture," *ISSCC*, pp. 136-137, 2015.
- [8] H. Noguchi, et al., "A 250-MHz 256b/I/O 1-Mb STT-MRAM with Advanced Perpendicular MTJ based Dual Cell for Nonvolatile Magnetic Caches to Reduce Active Power of Processors," *Symp. VLSI Circuits*, pp. 108-109, 2013.



Figure 30.2.1: Required write time and read margin vary with PVT, wasting write power and posing a challenge for sensing circuit design. Row-wise reference cells are used to generate a read-reference voltage that tracks row-wise PVT variation.



Figure 30.2.2: Constant-current based voltage sensing with a single-capacitor based offset cancellation. Compared with a conventional sense amplifier, input offset is reduced by >60%, significantly improving the sensing margin.



Figure 30.2.3: In-situ write-end detection and self-write termination using a continuous offset-cancelled sense amplifier reconfigured from the read-sense amplifier with no timing and area overhead.



Figure 30.2.4: Read operation measured results.



Figure 30.2.5: Write operation measured results.

|                         | This Work | ISSCC'13 [1] | ISSCC'10 [2] | ISSCC'10 [4]      | ISSCC'09 [6]      | ISSCC'15 [7]      | VLSI'13 [8] |
|-------------------------|-----------|--------------|--------------|-------------------|-------------------|-------------------|-------------|
| Technology (nm)         | 28        | 40           | 65           | 130               | 90                | 65                | 65          |
| Cell Type               | 1T1MTJ    | 1T1MTJ       | 1T1MTJ       | 2T1MTJ            | 2T2MTJ            | 2T2MTJ            |             |
| Target Application      | NVM       | NVM          | NVM          | LLC <sup>1)</sup> | LLC <sup>1)</sup> | LLC <sup>1)</sup> |             |
| Cell Area ( $\mu m^2$ ) | 75        | —            | —            | 85                | 327               | 169               | 107         |
| Capacity                | 1Mb       | 1Mb          | 64Mb         | 16kb              | 32Mb              | 1Mb               | 1Mb         |
| Macro Area ( $mm^2$ )   | 0.214     | 0.57         | 47.124       | —                 | 91.02             | 0.8196            | 0.628       |
| Power Supply (V)        | 1.2/1.8   | 1.1/2.5      | 1.2          | 1.2/3.3           | 1.5               | 1.2/0.9/0.4       | 1.2/0.9/0.4 |
| Word Length (bit)       | 16        | 32           | 16           | —                 | 32                | 256               | 256         |
| Read Speed (ns)         | 2.8       | 10           | 30           | 8                 | 12                | 3.3               | 4           |
| Write Speed (ns)        | 20        | —            | 30           | 10                | 12                | 3                 | 4           |
| Read Power (mW)         | 3.9       | —            | 7.8          | —                 | 60                | 21.6              | 17.8        |
| Write Power (mW)        | 3.6       | —            | 9.3          | 20                | 91                | 55.4              | 46.5        |
| Read Energy (pJ/bit)    | 0.7       | —            | 14.6         | —                 | 22.5              | 0.3               | 0.3         |
| Write Energy (pJ/bit)   | 4.5       | —            | 17.4         | —                 | 34.1              | 0.6               | 0.7         |

1) MTJ cells in LLC applications do not require high retention, allowing low-switching-current MTJs.

Figure 30.2.6: Comparison table to other STT-MRAM work.
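The energy-per-bit rows in the table appear to follow from the power, access-time and word-length rows. The sketch below assumes the relation energy/bit = power × access time / word length; this relation is inferred from the tabulated numbers and is not stated by the authors. The helper name `energy_per_bit_pj` is illustrative, and only columns with complete entries are used.

```python
def energy_per_bit_pj(power_mw, access_ns, word_bits):
    """Energy per bit in pJ: (mW * ns) gives pJ per access, divided by bits."""
    return power_mw * access_ns / word_bits


# (read power mW, read time ns, write power mW, write time ns, word length bits)
rows = {
    "This Work": (3.9, 2.8, 3.6, 20, 16),
    "ISSCC'10 [2]": (7.8, 30, 9.3, 30, 16),
    "ISSCC'09 [6]": (60, 12, 91, 12, 32),
}

for name, (p_rd, t_rd, p_wr, t_wr, bits) in rows.items():
    read_pj = energy_per_bit_pj(p_rd, t_rd, bits)
    write_pj = energy_per_bit_pj(p_wr, t_wr, bits)
    print(f"{name}: read {read_pj:.1f} pJ/bit, write {write_pj:.1f} pJ/bit")
```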



Figure 30.2.7: Die photo of 1Mb 28nm STT-MRAM.

### 30.3 A 28nm 32Kb Embedded 2T2MTJ STT-MRAM Macro with 1.3ns Read-Access Time for Fast and Reliable Read Applications

Tzu-Hsien Yang<sup>1,2</sup>, Kai-Xiang Li<sup>1</sup>, Yen-Ning Chiang<sup>1</sup>, Wei-Yu Lin<sup>1</sup>, Huan-Ting Lin<sup>1</sup>, Meng-Fan Chang<sup>1</sup>

<sup>1</sup>National Tsing Hua University, Hsinchu, Taiwan; <sup>2</sup>TSMC, Hsinchu, Taiwan

Many IoT and wearable devices require an on-chip small-to-mid-capacity nonvolatile memory (NVM) with a fast read-access time ($T_{AC}$) and reliable read operations, for applications including data-logging, configurable look-up tables (LUT), eFuse, and physically unclonable functions (PUF). STT-MRAM [1-4] is a good candidate for these applications due to its fast write speed, low-voltage write, and high endurance. However, STT-MRAM suffers from a small tunnel magnetoresistance ratio (TMR: $(R_{AP}-R_P)/R_P$) between the cell resistance of parallel (P, $R_P$) and anti-parallel (AP, $R_{AP}$) states [1-6]. Moreover, the read-disturb behavior of STT-MRAM cells is sensitive to the BL read voltage ($V_{BL\_RD}$) and the stress/read time. Compact 1T1MTJ arrays are suitable for high-density applications [5-6]; however, they use a power-hungry current-mode read scheme with a slow read speed due to the small RSM. Researchers have proposed 2T2MTJ (Fig. 30.3.1) arrays [1-4] with differential bitlines (BL and BLB) and a voltage-mode read scheme, with an enlarged RSM ($V_{RSM}$), for fast, low-power read operations. $V_{RSM}$ refers to the voltage difference between BL ($V_{BL}$) and BLB ($V_{BLB}$). 2T2MTJ STT-MRAM read operations still face the following challenges: (1) $V_{BL}$ and $V_{BLB}$ both drop from $V_{BL\_RD}$ to 0V quite quickly due to the large cell read current ($I_P$ and $I_{AP}$) or low R-value in both $R_P$ and $R_{AP}$, resulting in a small sensing window ($T_{SMW}$), which is the period when $V_{RSM}>\text{offset}$; (2) the maximum $V_{RSM}$ ($V_{RSM\_MAX}$) occurs at different times ($t_{RSM\_MAX}$) for different cells due to TMR ($R_{AP}/R_P$) variation; and (3) $V_{RSM}$ is degraded by the use of a low $V_{BL\_RD}$ to avoid read disturbs for high data-reliability applications. 
(1) and (2) lead to a decrease in  $V_{RSM}$  after reaching its peak ( $V_{RSM\_MAX}$ ), despite an increase in BL development time ( $t_{BL}$ ). When using a conventional voltage-mode sense amplifier (CNV-VSA) with a common activated (SAEN=1) timing ( $t_{SAEN}$ ) under the effects of (1)-(3), the signal to be amplified ( $\Delta V_{IN}<V_{RSM\_MAX}$ ) is subject to degradation at the VSA's differential inputs, resulting in a sensing failure at a low  $V_{BL\_RD}$ .
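Challenges (1) and (2) can be illustrated with a first-order model. The sketch assumes that each bitline discharges as a simple RC through its MTJ (access-transistor resistance is neglected and both bitlines have the same capacitance); the function names and all component values are hypothetical. Under that assumption $V_{RSM}$ rises, peaks, and then decays, and the peak time moves with TMR.

```python
import math


def v_rsm(t, v_bl_rd, tau_fast, tau_slow):
    """Differential BL voltage for two first-order RC discharges."""
    return v_bl_rd * (math.exp(-t / tau_slow) - math.exp(-t / tau_fast))


def t_rsm_max(tau_fast, tau_slow):
    """Time at which v_rsm peaks (from setting d/dt v_rsm = 0)."""
    return math.log(tau_slow / tau_fast) * tau_fast * tau_slow / (tau_slow - tau_fast)


# Hypothetical numbers: C_BL = 50 fF, R_P = 4 kohm, V_BL_RD = 0.3 V.
C_BL = 50e-15
R_P = 4e3
V_BL_RD = 0.3

for tmr in (0.5, 1.0, 1.5):            # TMR = (R_AP - R_P) / R_P
    r_ap = R_P * (1.0 + tmr)
    tau_p, tau_ap = R_P * C_BL, r_ap * C_BL
    t_peak = t_rsm_max(tau_p, tau_ap)
    print(f"TMR={tmr:.0%}: t_peak={t_peak * 1e9:.3f} ns, "
          f"V_RSM_MAX={v_rsm(t_peak, V_BL_RD, tau_p, tau_ap) * 1e3:.1f} mV")
```

A single shared $t_{SAEN}$ therefore samples some cells before and others after their own peak, which is the situation the conventional VSA faces in this simplified picture.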

A continuous-recording-and-enhancement (CRE) VSA, shown in Fig. 30.3.2, is proposed, which can tolerate a small TMR-Ratio to achieve a fast  $T_{AC}$  at a low  $V_{BL\_RD}$ , and achieves a reduced read power. The fabricated 28nm 32kb 2T2MTJ STT-MRAM macro achieves the fastest read-access time, 1.3ns, among recently reported STT-MRAMs.

The CRE-VSA combines a CRE unit with a conventional latch-based VSA (LVSA) to enhance the margin and to suppress the offset caused by $V_{TH}$ variation in the input stage. This is accomplished by continuously recording $V_{RSM}$ over time and increasing $\Delta V_{IN}$ as $t_{BL}$ increases, even when $V_{RSM}$ itself decreases as $t_{BL}$ increases.

The detailed operation of the proposed CRE-VSA is shown in Figs. 30.3.3 and 30.3.4. Each CRE unit includes a pair of capacitors (C1 & C2), cross-coupled transistors (M3 & M4), precharge transistors (M1 & M2) and reset transistors (M5 & M6). The CRE-VSA operates in three phases (P1-P3). In the standby state, which lasts from the end of one read operation to the beginning of the next, M5 and M6 are on to reset $V_{X1}$ and $V_{X2}$ to 0V, while PRE=1 (M1 & M2 are off) and $V_{IN1}$ and $V_{IN2}$ are 0V. In P1 ($V_{TH}$ sampling and BL precharge), M5 and M6 are off, and M1 and M2 are on to precharge $V_{IN1}$ and $V_{IN2}$ to $V_{DD}$, such that $V_{X1}=V_{DD}-V_{TH3}$ and $V_{X2}=V_{DD}-V_{TH4}$. This biases M3 and M4 at the edge of the on-off boundary. At the same time, $V_{BL}$ and $V_{BLB}$ are precharged to $V_{BL\_RD}$, as in a conventional read scheme, so that P1 has no timing overhead. In P2 (BL development with continuous recording and margin enhancement), M1 and M2 are off, such that IN1 and IN2 are floating. When the WL is on, BL and BLB are discharged by the cell read currents ($I_P$ & $I_{AP}$). The voltage swing ($V_S$) on BL and BLB ($V_{S\_BL}=V_{BL\_RD}-V_{BL}$ & $V_{S\_BLB}=V_{BL\_RD}-V_{BLB}$) is coupled to nodes X1 and X2 via C1 and C2, which results in $V_{X1}=V_{DD}-V_{TH3}-V_{S\_BL}$ and $V_{X2}=V_{DD}-V_{TH4}-V_{S\_BLB}$. M3 and M4 are then turned on, such that the over-drive voltages of M3 and M4 become $V_{OD3}=V_{IN2}-V_{DD}+V_{S\_BL}$ and $V_{OD4}=V_{IN1}-V_{DD}+V_{S\_BLB}$, which are independent of $V_{TH}$ variation. $V_{IN1}$ and $V_{IN2}$ then begin to drop in accordance with the discharge drain currents ($I_{D3}$ and $I_{D4}$) or the charge-sharing currents between X1-IN1 and X2-IN2, as controlled by M3 and M4 ($V_{OD3}$ & $V_{OD4}$). With an increase in $t_{BL}$, the $V_{S\_BL}$ and $V_{S\_BLB}$ of each time-step contribute a differential voltage drop ($\Delta V_S$) at X1 and X2, which continues to affect $V_{IN1}$ and $V_{IN2}$, $V_{OD3}$ and $V_{OD4}$, and $I_{D3}$ and $I_{D4}$. The cross-coupled feedback between M3 and M4 ($V_{OD3}$ & $V_{OD4}$) causes an increase in the drain current ($V_{DD}>V_{RSM}>0$) of the CRE path (C1-M3-IN1 & C2-M4-IN2) with the larger $V_S$. At the same time, the other path (with the smaller $V_S$) sees a decrease in drain current, which approaches zero when its transistor moves into the sub-threshold region ($V_{OD}<0$). As a result, the voltage difference between $V_{IN1}$ and $V_{IN2}$ ($\Delta V_{IN}$) is k-times $V_{RSM}$. Moreover, an increase in $t_{BL}$ leads to a constant increase in $\Delta V_{IN}$, despite a drop in $V_{RSM}$ after it reaches a peak. This trend differs from that of a conventional VSA.
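The $V_{TH}$ independence of the over-drive voltages can be checked in one step. The derivation assumes, as the expressions in the text imply, that M3 has its gate at IN2 and its source at X1, and that M4 has its gate at IN1 and its source at X2:

$$
\begin{aligned}
V_{OD3} &= V_{GS3}-V_{TH3} = V_{IN2}-V_{X1}-V_{TH3} \\
        &= V_{IN2}-\left(V_{DD}-V_{TH3}-V_{S\_BL}\right)-V_{TH3} = V_{IN2}-V_{DD}+V_{S\_BL}, \\
V_{OD4} &= V_{GS4}-V_{TH4} = V_{IN1}-V_{X2}-V_{TH4} \\
        &= V_{IN1}-\left(V_{DD}-V_{TH4}-V_{S\_BLB}\right)-V_{TH4} = V_{IN1}-V_{DD}+V_{S\_BLB}.
\end{aligned}
$$

The threshold sampled onto X1 and X2 during P1 cancels the threshold of the same transistor during P2, so mismatch between $V_{TH3}$ and $V_{TH4}$ does not appear in either over-drive voltage.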

When the BL starts to develop ($t_1$) for the read-1 case ($V_{S\_BL}>V_{S\_BLB}$, i.e. $V_{BL}<V_{BLB}$), the fact that $V_{X1}<V_{X2}$ results in $V_{IN1}<V_{IN2}$ and $V_{OD3}>V_{OD4}$. This causes $I_{D3}>I_{D4}$, which moderates the rate at which $V_{IN2}$ drops. Later (at $t_2$), the voltage drop at IN2 ($\Delta V_{IN2}=V_{DD}-V_{IN2}$) is significantly smaller than $V_{S\_BL}$, while the voltage drop at IN1 ($\Delta V_{IN1}=V_{DD}-V_{IN1}$) is larger than $V_{S\_BLB}$. As a result, M3 is in the saturation region and M4 is in the sub-threshold region. C1 continuously records/couples $V_{S\_BL}$ to X1 to lower $V_{IN1}$, whereas $V_{IN2}$ undergoes no further voltage drop, which keeps M4 in cutoff despite a continually increasing $V_{S\_BLB}$. $V_{BLB}$ approaches 0V and $V_{X2}$ continues dropping. Furthermore, $\Delta V_{IN}$ keeps increasing from $t_1$ to reach a maximum value ($\Delta V_{IN\_MAX}$) when $V_{BL}$ reaches 0V. Then, $\Delta V_{IN}$ remains at $\Delta V_{IN\_MAX}$ even as $V_{RSM}$ decreases while $V_{BLB}$ keeps dropping toward 0V.

Figure 30.3.5 presents the performance of the CRE-VSA. Unlike a conventional VSA, the $\Delta V_{IN}$ of the CRE-VSA increases with an increase in $t_{BL}$. The maximum ($\Delta V_{IN\_MAX}$) of the CRE-VSA is 5.85× larger than that of a conventional VSA. The small offset and enhanced margin of the CRE-VSA provide tolerance for a minimum TMR-ratio ($TMR_{MIN}$) that is 5.6 to 8× smaller than that of a conventional VSA across various resistance values. With the same sensing yield, the CRE scheme can tolerate a $V_{BL\_RD}$ that is 2× lower than that of a conventional VSA, which reduces the read-disturb rate by more than 10<sup>6</sup>×. At different BL lengths, this scheme allows for a 1.5-1.86× shorter $T_{AC}$ than a conventional VSA, when $TMR_{MIN}=100\%$ and $V_{BL\_RD}=0.3V$. A lower $V_{BL\_RD}$ reduces the read energy of the CRE-VSA by 34%, compared to a conventional VSA with the same $TMR_{MIN}$ sensing yield.

A 28nm 32kb 2T2MTJ macro is fabricated with test-modes using a CRE-VSA and a conventional VSA, with similarly skewed transistor sizes in the latch circuit. Figure 30.3.6 presents the measurement results. For a 32kb macro with limited TMR-ratio variation across cells, CRE achieves a 1.3ns macro-level read-access time ($T_{AC\_MACRO}$) at $V_{BL\_RD}=0.3V$, which is 1.43× faster than a conventional VSA, for which $T_{AC\_MACRO}$ is 1.86ns. A larger macro size and a higher TMR variation are expected to increase the $T_{AC}$ reduction ratio. Moreover, the CRE scheme works at $V_{BL\_RD}=140mV$, which is 2× lower than a conventional VSA ($V_{BL\_RD}=280mV$) at room temperature. At higher temperatures (e.g. 75°C), the improvement in minimum $V_{BL\_RD}$ increases to 190mV. Figure 30.3.7 presents the die photo and a chip summary.
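The two headline ratios follow directly from the measured values quoted above; the short check below uses only those four numbers, with constant names chosen for illustration.

```python
T_AC_CRE_NS = 1.3
T_AC_CONVENTIONAL_NS = 1.86
V_BL_RD_CRE_MV = 140
V_BL_RD_CONVENTIONAL_MV = 280

speedup = T_AC_CONVENTIONAL_NS / T_AC_CRE_NS
voltage_ratio = V_BL_RD_CONVENTIONAL_MV / V_BL_RD_CRE_MV
print(f"read-access speedup: {speedup:.2f}x")
print(f"minimum V_BL_RD ratio: {voltage_ratio:.1f}x")
```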

#### Acknowledgement:

The authors would like to thank NVM-DTP of TSMC, TSMC-JDP and MOST-Taiwan for their support.

#### References:

- [1] H. Noguchi, et al., "4Mb STT-MRAM-based cache with memory-access-aware power optimization and write-verify-write/read-modify-write scheme," *ISSCC*, pp.132-133, 2016.
- [2] H. Noguchi, et al., "A 3.3ns-access-time 71.2μW/MHz 1Mb embedded STT-MRAM using physically eliminated read-disturb scheme and normally-off memory architecture," *ISSCC*, pp.136-137, 2015.
- [3] H. Noguchi, et al., "Highly reliable and low-power nonvolatile cache memory with advanced perpendicular STT-MRAM for high-performance CPU," *Symp. VLSI Circuits*, 2014.
- [4] H. Noguchi, et al., "A 250-MHz 256b-I/O 1-Mb STT-MRAM with advanced perpendicular MTJ based dual cell for nonvolatile magnetic caches to reduce active power of processors," *Symp. VLSI Circuits*, pp. 108-109, 2013.
- [5] K. Rho, et al., "A 4Gb LPDDR2 STT-MRAM with compact 9F<sup>2</sup> 1T1MTJ cell and hierarchical bitline architecture," *ISSCC*, pp. 396-397, 2017.
- [6] C. Kim, et al., "A covalent-bonded cross-coupled current-mode sense amplifier for STT-MRAM with 1T1MTJ common source-line structure array," *ISSCC*, pp. 134-135, 2015.
- [7] M.-C. Shih, et al., "Reliability study of perpendicular STT-MRAM as emerging embedded memory qualified for reflow soldering at 260°C," *IEEE Symp. VLSI Tech.*, 2016.



Figure 30.3.1: 2T2MTJ applications and the read challenge.



Figure 30.3.2: Concept and circuit of proposed CRE-VSA.



Figure 30.3.3: Continuous-recording-and-enhancement voltage sense amplifier operation - part 1.



Figure 30.3.4: Continuous-recording-and-enhancement voltage sense amplifier operation - part 2.



Figure 30.3.5: Continuous-recording-and-enhancement voltage sense amplifier performance summary.



Figure 30.3.6: Measured results.



Figure 30.3.7: Die photo and summary table.

### 30.4 A 20ns-Write 45ns-Read and $10^{14}$ -Cycle Endurance Memory Module Composed of 60nm Crystalline Oxide Semiconductor Transistors

Shuhei Maeda<sup>1</sup>, Satoru Ohshita<sup>1</sup>, Kazuma Furutani<sup>1</sup>, Yuto Yakubo<sup>1</sup>, Takahiko Ishizu<sup>1</sup>, Tomoaki Atsumi<sup>1</sup>, Yoshinori Ando<sup>1</sup>, Daisuke Matsubayashi<sup>1</sup>, Kiyoshi Kato<sup>1</sup>, Takashi Okuda<sup>1</sup>, Masahiro Fujita<sup>2</sup>, Shunpei Yamazaki<sup>1</sup>

<sup>1</sup>Semiconductor Energy Laboratory, Atsugi, Japan  
<sup>2</sup>University of Tokyo, Tokyo, Japan

Development of LSIs targeting artificial intelligence (AI) has accelerated, and some chips are already commercially available and in use in a number of applications. An LSI capable of performing arithmetic operations for deep learning and similar workloads at low power and high speed is crucial for achieving more sophisticated AI. Power consumption is increasing significantly, owing particularly to the practical use of AI, and power-reduction techniques are urgently needed.

One way of reducing LSI power consumption is power gating, where the LSI is powered off during standby. Embedded memories capable of retaining data for a long time are required for power gating [1-4]. Such an embedded memory is also used for arithmetic operations in deep learning, where the frequency of write and read operations increases; hence, the embedded memory also requires high endurance. Moreover, in executing deep learning processing, an arithmetic circuit needs to obtain data from an external memory. Accordingly, to achieve a low-power and high-speed LSI for deep learning, the delay time and the energy for charging/discharging the I/Os need to be reduced [5].

This paper presents a low-power and high-speed embedded LSI memory for deep learning, composed of crystalline oxide semiconductor FETs (OSFETs), which are compatible with a CMOS logic process. The fabricated memory module has 1kb density in a 60nm OSFET process and demonstrates a write/read time of 20/45ns and a write/read energy of 97.9/123.6pJ. In addition, the memory cell achieves a $10^{14}$-cycle endurance. Therefore, it can operate as an embedded memory module for deep learning without limiting the large number of rewrite operations required.

Data written to a storage capacitor via an OSFET is retained for a long time due to the OSFET's ultralow off-state current [6]; therefore, this configuration works as a memory cell. The OSFET is used to control write access; as such, OSFET memories have high endurance. The OSFET is formed in a thin-film process independent of the CMOS process and thus can be stacked on top of CMOS logic circuits. A low-power normally-off CPU and a low-power memory have been presented in [3,4], in which logic circuits are implemented in CMOS and the memory is implemented using OSFETs. The oxide semiconductor (OS) memory module presented in this paper does not require any CMOS circuits and can operate with OSFETs alone. As shown in Fig. 30.4.1, and as in the architecture presented in [5], the data delay time due to bus transactions can be reduced by placing the memory close to each arithmetic circuit. The OS memory module can be embedded in an arithmetic processing unit. Hence, the OS memory module is the key to reducing area, since neither the MTJ device used in MRAM nor the ferroelectric device used in FeRAM can form a memory driver circuit by itself.

The OS memory module's driver circuits for the memory cell array are formed with dynamic logic circuits, as OSFETs have a single conductivity type. OSFETs suppress charge leakage from dynamic nodes because of their ultralow off-state current; therefore, a keeper circuit or similar circuit is not required. Figure 30.4.2 shows the configuration of a four-stage shift register fabricated with OSFET dynamic logic circuits, and a waveform showing its operation. When an input pulse has been shifted to the third-stage register, the clock and power are shut down for 1s and then restarted. The output from the fourth-stage register is then measured, confirming correct operation.

Figure 30.4.3 shows the configuration of the fabricated memory module. It is composed of a memory cell array, a row decoder, a write circuit, and a readout circuit. A readout transistor (MR) is provided with a backgate terminal, and the threshold voltage $V_t$ of the OSFET is lowered to decrease read time. The use of a dynamic logic circuit yields a small component count compared to a static CMOS logic circuit: a component comparison table for 32 wordlines and 32 bitlines is shown in Fig. 30.4.3. Also shown is an operational timing diagram for the memory module. During T1 for a write operation, all WBLs and WWLs (Fig. 30.4.4) are fixed to $V_{SS}$. Next, during T2, the addressed WWL is selected, and data is written to the memory cells for one word (32b) via the WBLs. During T1 for a read operation, RBLs are fixed to $V_{CH}$ and the output value of the readout circuit is $V_{DD}$. Then, during T2, the addressed RWL is selected and the data in the memory cells are read out through the RBLs; the output value of the readout circuit depends on the RBL voltage.

Figure 30.4.4 shows the 3T1C memory cell. A transistor M1 gates write access to a storage capacitor $C_S$. Data readout is accomplished via a transistor M2, which also has a backgate terminal that allows its $V_t$ to be lowered to decrease the readout time. A selection transistor M3 is used to prevent unwanted data readout from unselected cells due to M2's lowered $V_t$. The timing diagram in Fig. 30.4.4 shows the sequence of events during a read operation. First, reset transistor M4 is turned on to drive RBL to $V_{CH}$. Then, M4 is turned off and RWL is driven high. If a logic-1 has been written to storage node SN, M2 is turned on and RBL is charged. Here, the electrode of $C_S$ opposite to the M2 gate is connected to RBL. Hence, the SN voltage is increased by capacitive coupling, and the gate voltage of M2 rises, thereby decreasing read time. Simulation results, as shown in Fig. 30.4.4, indicate that the read-access time for reading a logic-1 is reduced by 33% by bootstrapping RBL.

The test chip is fabricated in a 60nm OSFET process. Figure 30.4.5 shows Shmoo plots for the write pulse width and the read-access time at room temperature. The write pulse width is 20ns and the read-access time is 45ns using a 3.3V supply.

Figure 30.4.6 shows the endurance of the memory cell. A logic-1 and a logic-0 were repeatedly written into a 1b 2T1C test cell (shown in Fig. 30.4.6), and M2's $I_d$-$V_g$ curves were measured at regular intervals for a read operation. The threshold voltage $V_t$ of M2 was calculated with the square-root extrapolation method from each $I_d$-$V_g$ curve. The $V_t$ difference between reading a logic-0 and a logic-1 is approximately 2.5V after $10^{14}$ write cycles, which demonstrates the high endurance of an OSFET-based memory cell.

Figure 30.4.7 shows the test chip micrograph and a comparison table to other emerging memories. At room temperature, the standby power of the test chip is 9.9nW, while a write operation requires 97.9μW/MHz and a read operation requires 258.6μW/MHz. Note that the test chip has a larger readout load capacitance than that typical of other embedded memories, since the output buffer on this test chip has an input capacitance of 5pF to allow for measurements. When a more typical readout load capacitance of 10fF is used, the read power is 123.6μW/MHz. Assuming the read power is 123.6μW/MHz and the standby power is 9.9nW, a 1Mb memory array with this memory module will spend 133.7μW/MHz on read operations; such memory consumes less power than other memories.
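The 1Mb estimate above can be reproduced with a short calculation. The following is a minimal sketch, assuming (this is not stated in the text) that the 1Mb array is built from 1024 of the 1kb modules, that one module is read at a time, and that the standby power of all modules is normalized to a 1MHz access rate. The function name and the `clock_mhz` parameter are hypothetical.

```python
# Values quoted in the text for the 1kb test module.
READ_POWER_UW_PER_MHZ = 123.6   # read power with a 10fF readout load
STANDBY_POWER_NW_1KB = 9.9      # standby power of one 1kb module


def read_power_1mb_uw_per_mhz(clock_mhz=1.0):
    """Estimate read power per MHz for a 1Mb array made of 1kb modules.

    Assumption: all 1024 modules contribute standby power while a single
    module is being read; standby power is normalized to clock_mhz.
    """
    modules = 1024                                       # 1Mb / 1kb
    standby_uw = modules * STANDBY_POWER_NW_1KB * 1e-3   # nW -> uW
    return READ_POWER_UW_PER_MHZ + standby_uw / clock_mhz


print(round(read_power_1mb_uw_per_mhz(), 1))
```

Under these assumptions the standby contribution is about 10.1μW, and the total rounds to the 133.7μW/MHz quoted in the text.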

#### References:

- [1] S. Bartling, et al., "An 8MHz 75μA/MHz Zero-Leakage Non-Volatile Logic-Based Cortex-M0 MCU SoC Exhibiting 100% Digital State Retention at VDD=0V with <400ns Wakeup and Sleep Transitions," ISSCC, pp. 432-433, 2013.
- [2] Y. Lu, et al., "Fully Functional Perpendicular STT-MRAM Macro Embedded in 40 nm Logic for Energy-efficient IOT Applications," IEDM, pp. 660-663, 2015.
- [3] T. Onuki, et al., "Embedded memory and ARM Cortex-M0 core using 60-nm C-axis aligned crystalline indium-gallium-zinc oxide FET integrated with 65-nm Si CMOS," IEEE Symp. VLSI Circuits, pp. 124-125, 2016.
- [4] T. Ishizu, et al., "A 140 MHz 1 Mbit 2T1C Gain-Cell Memory with 60-nm Indium-Gallium-Zinc Oxide Transistor Embedded Into 65-nm CMOS Logic Process Technology," IEEE Symp. VLSI Circuits, pp. 162-163, 2017.
- [5] K. Bong, et al., "A 0.62mW Ultra-Low-Power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-On Haar-Like Face Detector," ISSCC, pp. 248-249, 2017.
- [6] H. Inoue, et al., "Nonvolatile Memory with Extremely Low-Leakage Indium-Gallium-Zinc-Oxide Thin-Film Transistor," JSSC, vol. 47, no. 9, pp. 2258-2265, Sept. 2012.



Figure 30.4.1: Embedded memory module composed of only crystalline oxide semiconductor FETs (OSFETs) for deep neural networks.



Figure 30.4.2: Configuration of four-stage shift register designed with OSFET dynamic logic circuits and waveform showing its operation.



Figure 30.4.3: Chip configuration and its operation timing diagram.



Figure 30.4.4: Memory cell configuration, its operation timing diagram, and the impact of bootstrap effect on read access time.



Figure 30.4.5: Shmoo plots for 1kb memory module.



Figure 30.4.6: Endurance of the memory cell.



Figure 30.4.7: Test chip micrograph and comparison table of emerging embedded memory.

# Session 31 Overview:

## *Computation in Memory for Machine Learning*

### TECHNOLOGY DIRECTIONS AND MEMORY SUBCOMMITTEES



**Session Chair:**  
***Naveen Verma***  
*Princeton University, Princeton, NJ*



**Associate Chair:**  
***Fatih Hamzaoglu***  
*Intel, Hillsboro, OR*

**Subcommittee Chair: *Makoto Nagata***, Kobe University, Kobe, Japan, Technology Directions

**Subcommittee Chair: *Leland Chang***, IBM, Yorktown Heights, NY, Memory

Many state-of-the-art systems for machine learning are limited by memory in terms of the energy they require and the performance they can achieve. This session explores how this bottleneck can be overcome by emerging architectures that perform computation inside the memory array. This necessitates unconventional, typically mixed-signal, circuits for computation, which exploit the statistical nature of machine-learning applications to achieve high algorithmic performance with substantial energy and throughput gains. Further, the architectures serve as a driver for emerging memory technologies, exploiting the high-density and nonvolatility these offer towards increased scale and efficiency of computation. The innovative papers in this session provide concrete demonstrations of this promise, by going beyond conventional architectures.



3:15 PM

**31.1 Conv-RAM: An Energy-Efficient SRAM with Embedded Convolution Computation for Low-Power CNN-Based Machine Learning Applications**
*A. Biswas*, Massachusetts Institute of Technology, Cambridge, MA

In Paper 31.1, MIT describes a compute-in-memory structure by performing multiplication between an activation and 1-b weight on a bit line, and accumulation through analog-to-digital conversion of charge across bit lines. Mapping two convolutional layers to the accelerator, an accuracy of 99% is achieved on a subset of the MNIST dataset, at an energy efficiency of 28.1TOPS/W.



3:45 PM

**31.2 A 42pJ/Decision 3.12TOPS/W Robust In-Memory Machine Learning Classifier with On-Chip Training**
*S. K. Gonugondla*, University of Illinois, Urbana-Champaign, IL

In Paper 31.2, UIUC describes a compute-in-memory architecture that simultaneously accesses multiple weights in memory to perform 8b multiplication on the bit lines, and introduces on-chip training, via stochastic gradient descent, to mitigate non-idealities in mixed-signal compute. An accuracy of 96% is achieved on the MIT-CBCL dataset, at an energy efficiency of 3.125TOPS/W.



4:15 PM

**31.3 Brain-Inspired Computing Exploiting Carbon Nanotube FETs and Resistive RAM: Hyperdimensional Computing Case Study**
*T. F. Wu*, Stanford University, Stanford, CA

In Paper 31.3, Stanford/UCB/MIT demonstrate a brain-inspired hyperdimensional (HD) computing nanosystem to recognize languages and sentences from minimal training data. The paper uses 3D integration of CNTFETs and RRAM cells, and measurements show that 21 European languages can be classified with 98% accuracy from >20,000 sentences.



4:45 PM

**31.4 A 65nm 1Mb Nonvolatile Computing-in-Memory ReRAM Macro with Sub-16ns Multiply-and-Accumulate for Binary DNN AI Edge Processors**
*W-H. Chen*, National Tsing Hua University, Hsinchu, Taiwan

In Paper 31.4, National Tsing-Hua University implements multiply-and-accumulate operations using a 1Mb RRAM array for a Binary DNN in edge processors. The paper proposes an offset-current-suppressing sense amp and input-aware, dynamic-reference current generation to overcome sense-margin challenges. Silicon measurements show successful operation with sub-16ns access time.



5:00 PM

**31.5 A 65nm 4Kb Algorithm-Dependent Computing-in-Memory SRAM Unit-Macro with 2.3ns and 55.8TOPS/W Fully Parallel Product-Sum Operation for Binary DNN Edge Processors**
*W-S. Khwa*, National Tsing Hua University, Hsinchu, Taiwan and TSMC, Hsinchu, Taiwan

In Paper 31.5, National Tsing-Hua University demonstrates multiply-and-accumulate operations using a 4kb SRAM for fully-connected neural networks in edge processors. The paper overcomes the challenges of excessive current, sense-amplifier offset, and sensing  $V_{ref}$  optimization, arising due to simultaneous activation of multiple word lines. Sub-3ns access speed is achieved with simulated 97.5% MNIST accuracy.

### 31.1 Conv-RAM: An Energy-Efficient SRAM with Embedded Convolution Computation for Low-Power CNN-Based Machine Learning Applications

Avishek Biswas, Anantha P. Chandrakasan

Massachusetts Institute of Technology, Cambridge, MA

Convolutional neural networks (CNNs) provide state-of-the-art results in a wide variety of machine learning (ML) applications, ranging from image classification to speech recognition. However, they are very computationally intensive and require huge amounts of storage. Recent work has strived to reduce the size of CNNs: [1] proposes a binary-weight network (BWN), where the filter weights ($w_i$'s) are $\pm 1$ (with a common scaling factor per filter: $\alpha$). This leads to a significant reduction in the amount of storage required for the $w_i$'s, making it possible to store them entirely on-chip. However, in a conventional all-digital implementation [2, 3], reading the $w_i$'s and the partial sums from the embedded SRAMs requires a lot of data movement per computation, which is energy-hungry. To reduce data movement and the associated energy, we present an SRAM-embedded convolution architecture (Fig. 31.1.1), which does not require reading the $w_i$'s explicitly from the memory. Prior work on embedded ML classifiers has focused on 1b outputs [4] or a small number of output classes [5], neither of which is sufficient for CNNs. This work uses 7b inputs/outputs, which is sufficient to maintain good accuracy for most of the popular CNNs [1]. The convolution operation is implemented as voltage averaging (Fig. 31.1.1), since the $w_i$'s are binary, while the averaging factor (1/N) implements the weight coefficient $\alpha$ (with a new scaling factor, M, implemented off-chip).
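The equivalence between a scaled binary-weight dot product and an averaged sum can be illustrated numerically. This is a minimal sketch in plain Python, not the chip's signal path: it assumes the scaling factor satisfies $\alpha = M/N$, and the function names and example values are hypothetical.

```python
def averaged_sum(x, w):
    """In-array term: Y_OUT = (1/N) * sum(w_i * x_i), with w_i in {+1, -1}."""
    if any(wi not in (1, -1) for wi in w):
        raise ValueError("weights must be +1 or -1")
    return sum(wi * xi for wi, xi in zip(w, x)) / len(x)


def bwn_output(x, w, m):
    """Full output: Y = M * Y_OUT, with the factor M applied off-chip."""
    return m * averaged_sum(x, w)


# Example: N = 4 inputs and alpha = 0.5, so M = alpha * N = 2.
x = [3, -1, 4, 2]
w = [1, -1, -1, 1]
alpha, m = 0.5, 2
direct = sum(alpha * wi * xi for wi, xi in zip(w, x))   # sum(alpha * w_i * x_i)
print(direct, bwn_output(x, w, m))
```

Both expressions evaluate to the same number, which is why the array only needs to compute the average and can leave the multiplication by M to later processing.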

Figure 31.1.2 shows the overall architecture of the 256×64 conv-SRAM (CSRAM) array. It is divided into 16 local arrays, each with 16 rows to reduce the area overhead of the ADCs and the local analog multiply-and-average (MAV<sub>a</sub>) circuits. Each local array stores the binary weights ($w_i$'s) in the 10T bit-cells (logic-0 for +1 and logic-1 for -1) for each individual 3D filter in a conv-layer. Hence, each local array has a dedicated ADC to compute its partial convolution output ($Y_{OUT}$). The input-feature-map values ($X_{IN}$) are fed into column-wise DACs (GBL\_DAC), which pre-charge the global read bit-lines (GRBL) and the local bit-lines (LBL) to an analog voltage ($V_a$) that is proportional to the digital $X_{IN}$ code. The GRBLs are shared by all of the local arrays, since in CNNs each input is shared/processed in parallel by multiple filters. Figure 31.1.3 shows the schematic of the proposed GBL\_DAC circuit. It consists of a cascaded PMOS constant current source. The GRBL is charged with this current for a duration $t_{ON}$, which is directly proportional to the $X_{IN}$ code. For better $t_{ON}$ vs. $X_{IN}$ linearity, there should be only one ON pulse for every code, to avoid multiple charging phases. This is impossible to generate using signals with binary-weighted pulse-widths. Hence, we propose an implementation where the 3 MSBs of $X_{IN}$ are used to select (using $TD_{56}$) the ON pulse-width for the first half of charging ($TD_{56}$ is high) and the 3 LSBs for the second half ($TD_{56}$ is low). An 8:1 mux with 8 timing signals is shared during both phases to reduce the area overhead and the signal routing. As such, it is possible to generate a single ON pulse for each $X_{IN}$ code, as shown for codes 63 and 24 in Fig. 31.1.3. This DAC architecture has better mismatch and linearity than the binary-weighted PMOS charging DACs [4], since the same PMOS stack is used to charge GRBL for all input codes. Furthermore, the pulse-widths of the timing signals typically have less variation compared to those arising from PMOS $V_t$ mismatch.
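One way to picture the single-pulse scheme is as a pulse that grows outward from the boundary between the two charging halves. The sketch below is a behavioral illustration only, under assumptions that are not confirmed by the text: time is counted in unit delays, the boundary sits at 56 units (chosen to match the name $TD_{56}$ and the largest MSB-selected width of 7×8), the MSB-selected pulse ends at the boundary, and the LSB-selected pulse starts there. The constant `T_MID` and the function `on_pulse` are hypothetical names.

```python
T_MID = 56  # assumed boundary between the two charging halves, in unit delays


def on_pulse(x_in):
    """Return (start, stop) of the single ON pulse for a 6b input magnitude."""
    if not 0 <= x_in <= 63:
        raise ValueError("x_in must be a 6b magnitude (0..63)")
    msb, lsb = x_in >> 3, x_in & 0b111
    start = T_MID - 8 * msb   # first half: MSB-selected width, ending at T_MID
    stop = T_MID + lsb        # second half: LSB-selected width, starting at T_MID
    return start, stop


for code in (63, 24):
    start, stop = on_pulse(code)
    print(code, (start, stop), stop - start)
```

Because the two sub-pulses meet at the boundary, every code yields one contiguous pulse whose width equals the code, so $t_{ON}$ stays proportional to $X_{IN}$ in this model.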

After the DAC pre-charge phase, the $w_i$'s in a local array are evaluated locally by turning on a RWL, as shown in Fig. 31.1.4. One of the local bit-lines (LBLF or LBLT) will be discharged to ground depending on the stored $w_i$ (0 or 1). This is done in parallel for all 16 local arrays. Next, the RWLs are turned off and the appropriate local bit-lines are shorted together horizontally to evaluate the average via the local MAV<sub>a</sub> circuit. MAV<sub>a</sub> passes the voltages of the LBLT and LBLF to the positive ($V_{p-AVG}$) and negative ($V_{n-AVG}$) voltage rails, depending on the sign of the input $X_{IN}$ ($EN_p$ is ON for $X_{IN}>0$, $EN_n$ is ON for $X_{IN}<0$). The difference between $V_{p-AVG}$ and $V_{n-AVG}$ is fed to a charge-sharing-based ADC (CSH\_ADC) to get the digital value of the computation ($Y_{OUT}$). Algorithm simulations (Fig. 31.1.1) show that the $Y_{OUT}$ distribution peaks around 0 and is typically limited to $\pm 7$ for a full-scale input of $\pm 31$. Hence, a serial integrating ADC architecture is more applicable than other area-intensive (e.g. SAR) or more power-hungry (e.g. flash) ADCs. A PMOS-input sense-amplifier (SA) is used to compare $V_{p-AVG}$ and $V_{n-AVG}$, and its output is fed to the ADC logic. The first comparison determines the sign of $Y_{OUT}$, then capacitive charge-sharing is used to integrate the lower of the 2 voltage rails with a reference local column that replicates the local bit-line capacitance. This process continues until the voltage of the rail being integrated exceeds the other one, at which point the SA output flips. This signals conversion completion and no further SA\_EN pulses are generated for the SA. Figure 31.1.4 shows the waveforms for a typical operation cycle. To reduce the effect of SA offset on the $Y_{OUT}$ value, a multiplexer is used at the input of the SA to flip the inputs on alternate cycles.
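The conversion sequence described above (sign first, then step the lower rail until the comparator flips) can be sketched behaviorally. This is an idealized model, not the circuit: it assumes every charge-sharing step raises the lower rail by a fixed amount `lsb_mv`, whereas the real step depends on the capacitance ratio and rail voltages. Voltages are integers in millivolts, and `csh_adc`, `lsb_mv`, and `max_code` are hypothetical names.

```python
def csh_adc(v_p_mv, v_n_mv, lsb_mv=10, max_code=7):
    """Behavioral model of a serial integrating ADC.

    Returns a signed code: the sign comes from the first comparison, the
    magnitude from the number of integration steps completed before the
    rail being integrated exceeds the other rail.
    """
    sign = 1 if v_p_mv >= v_n_mv else -1      # first SA comparison
    low, high = sorted((v_p_mv, v_n_mv))      # integrate the lower rail
    code = 0
    while code < max_code:
        low += lsb_mv                         # one charge-sharing step (idealized)
        if low > high:                        # SA output flips: conversion done
            break
        code += 1
    return sign * code


print(csh_adc(530, 500), csh_adc(500, 530), csh_adc(500, 500))
```

In this model the number of steps, and hence the number of SA\_EN pulses, grows with the magnitude of the output code, which suits an output distribution concentrated near 0.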

The 256×64 CSRAM array is implemented in a 65nm LP-CMOS process. Figure 31.1.5 shows the measured results for the GBL\_DAC, which is used in its 5b mode by setting the LSB of $X_{IN}$ to 0. To estimate the DAC analog output voltage ($V_a$), $V_{GRBL}$ for the 64 columns are compared to an external $V_{ref}$ by the column-wise SAs used in the SRAM's global read circuit. For each $X_{IN}$, the $V_{ref}$ at which more than 50% of the SA outputs flip is chosen as an average estimate of $V_a$. An initial one-time calibration is needed to set $V_a = 1V$ for $X_{IN} = 31$ (max. input code). As seen in Fig. 31.1.5, there is good linearity in the DAC transfer function with DNL < 1LSB. Figure 31.1.5 also shows the overall system transfer function, consisting of the GBL\_DAC, MAV<sub>a</sub> and CSH\_ADC circuits. For this experiment, the same code is provided to all $X_{IN}$'s, all $w_i$'s have the same value, and the $Y_{OUT}$ outputs are observed. The measurement results show good linearity in the overall transfer function and low variation in the $Y_{OUT}$ values, mainly because variation in BL capacitance (used for averaging and CSH\_ADC) is much lower than transistor $V_t$ variation. SA offset cancellation further helps to reduce $Y_{OUT}$ variation. It can also be seen from Fig. 31.1.5 that the energy/ADC scales linearly with the output code, which is expected for an integrating ADC topology.

To demonstrate the functionality for a real CNN architecture, the MNIST handwritten digit recognition dataset is used with the LeNet-5 CNN. 100 test images are run through the 2 convolutional and 2 fully-connected layers (implemented by the CSRAM array). We achieve a classification error rate of 1% after the first 2 convolutional layers and 4% after all 4 layers, which demonstrates the ability of the CSRAM architecture to compute convolutions. The distributions of $Y_{OUT}$ in Fig. 31.1.6 for the first 2 computation-intensive convolutional layers (C1, C3) show that both layers have a mean of ~1LSB, justifying the use of a serial ADC topology. Figure 31.1.6 also shows the overall computational energy annotated with the different components. Layers C1 and C3 consume 4.23pJ and 3.56pJ per convolution, computing 25 and 50 MAV operations in each cycle, respectively. Layer C3 achieves the best energy efficiency of 28.1TOPS/W, compared to 11.8TOPS/W for layer C1, since C1 uses only 6 of the 16 local arrays. Compared to prior digital accelerator implementations for MNIST, we achieve a >16x improvement in energy-efficiency, and a >60x higher FOM (energy-efficiency × throughput/SRAM size) due to the massively parallel in-memory analog computations. This demonstrates that the proposed SRAM-embedded architecture is capable of highly energy-efficient convolution computations that could enable low-power ubiquitous ML applications for a smart Internet-of-Everything.
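The quoted efficiency figures can be cross-checked from the per-convolution energies. The sketch below assumes that each MAV operation counts as two operations (one multiply and one average/accumulate); the text does not state this convention, but it reproduces both quoted numbers. The function name is hypothetical.

```python
def tops_per_watt(mav_ops, energy_pj, ops_per_mav=2):
    """Energy efficiency in TOPS/W.

    Operations per picojoule equal tera-operations per joule,
    which is the same as TOPS/W.
    """
    return mav_ops * ops_per_mav / energy_pj


print(round(tops_per_watt(50, 3.56), 1))   # layer C3
print(round(tops_per_watt(25, 4.23), 1))   # layer C1
```

With this convention, C3 evaluates to about 28.1TOPS/W and C1 to about 11.8TOPS/W, in line with the values reported above.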

#### Acknowledgements:

This project was funded by Intel Corporation. The authors thank Vivienne Sze and Hae-Seung Lee for their helpful technical discussions.

#### References:

- [1] M. Rastegari, et al., "XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks", *arXiv:1603.05279*, 2016, <https://arxiv.org/abs/1603.05279>.
- [2] J. Sim, et al., "A 1.42TOPS/W Deep Convolutional Neural Network Recognition Processor for Intelligent IoT Systems", *ISSCC*, pp. 264-265, 2016.
- [3] B. Moons, et al., "A 0.3–2.6 TOPS/W Precision-Scalable Processor for Real-Time Large-Scale ConvNets", *IEEE Symp. VLSI Circuits*, 2016.
- [4] J. Zhang, et al., "A Machine-Learning Classifier Implemented in a Standard 6T SRAM Array", *IEEE Symp. VLSI Circuits*, 2016.
- [5] M. Kang, et al., "A 481pJ/decision 3.4M decision/s Multifunctional Deep In-memory Inference Processor using Standard 6T SRAM Array", *arXiv:1610.07501*, 2016, <https://arxiv.org/abs/1610.07501>.



$$Y = \sum_i W_i \times X_i = \sum_i \alpha \cdot w_i \times X_i = \frac{M}{N} \sum_i w_i \times X_i$$

$$w_i \in \{+1, -1\} \quad M, N \in I$$

$$Y_{OUT} = \frac{1}{N} \sum_i w_i \times X_i = ADC \left( \frac{1}{N} \sum_i w_i \times DAC(X_i) \right)$$

$$V_{p(n)-AVG} = \frac{1}{N} \sum_{i=0}^{N-1} |w_i| \times V_{x_i} \quad V_{Y-AVG} = V_{p-AVG} - V_{n-AVG}$$

[Figure panels: weight multiplication and averaging; histogram of $Y_{OUT}$ for LeNet-5 CNN conv. layer #2 (vertical axis: no. of occurrences, $\times 10^5$).]

Figure 31.1.1: Concept of embedded convolution computation, performed by averaging in SRAM, for binary-weight convolutional neural networks.



Figure 31.1.2: Overall architecture of the Convolution-SRAM (CSRAM) showing local arrays, column-wise DACs and row-wise ADCs to implement convolution as weighted averaging.



Figure 31.1.3: Schematic and timing diagram for the column-wise GBL\_DAC, which converts the convolution digital input to an analog pre-charge voltage for the SRAM.



Figure 31.1.4: Architecture for the row-wise multiply-and-average (MAV<sub>a</sub>) and CSH\_ADC. Operational waveforms for convolution computation.



Figure 31.1.5: Measured performance of GBL\_DAC and CSH\_ADC in the CSRAM array. Also shown is the effect of the offset cancellation technique.



Figure 31.1.6: Energy and output distribution measured results for the first two convolutional-layers (C1, C3) of the LeNet-5 CNN for the MNIST dataset. Table showing comparison to prior work.



| Parameter                         | Value                |
|-----------------------------------|----------------------|
| Technology                        | 65nm                 |
| CSRAM size                        | 16 Kb                |
| CSRAM area                        | 0.067 mm<sup>2</sup> |
| Array organization                | 256×64 (10T bitcell) |
| # of column DACs                  | 64                   |
| # of row ADCs                     | 16                   |
| Supply voltage                    | 1.2V / 0.9V          |
| Main clock frequency              | 6.7 MHz              |
| ADC clock frequency               | 364 MHz              |
| Max # of mult+avg per convolution | 64                   |
| Energy / convolution              | 3.6 pJ               |

Figure 31.1.7: Die micrograph and test-chip summary table.

### 31.2 A 42pJ/Decision 3.12TOPS/W Robust In-Memory Machine Learning Classifier with On-Chip Training

Sujan Kumar Gonugondla, Mingu Kang, Naresh Shanbhag

University of Illinois at Urbana-Champaign, IL

Embedded sensory systems (Fig. 31.2.1) continuously acquire and process data for inference and decision-making purposes under stringent energy constraints. These *always-ON* systems need to track changing data statistics and environmental conditions, such as temperature, with minimal energy consumption. Digital inference architectures [1,2] are not well-suited for such energy-constrained sensory systems due to their high energy consumption, which is dominated (>75%) by the energy cost of memory read accesses and digital computations. In-memory architectures [3,4] significantly reduce the energy cost by embedding pitch-matched analog computations in the periphery of the SRAM bitcell array (BCA). However, their analog nature combined with stringent area constraints makes these architectures susceptible to process, voltage, and temperature (PVT) variation. Previously, off-chip training [4] has been shown to be effective in compensating for PVT variations of in-memory architectures. However, PVT variations are die-specific and data statistics in *always-ON* sensory systems can change over time. Thus, on-chip training is critical to address both sources of variation and to enable the design of energy efficient *always-ON* sensory systems based on in-memory architectures. The stochastic gradient descent (SGD) algorithm is widely used to train machine learning algorithms such as support vector machines (SVMs), deep neural networks (DNNs) and others. This paper demonstrates the use of on-chip SGD-based training to compensate for PVT and data statistics variation to design a robust in-memory SVM classifier.

Figure 31.2.2 shows the system architecture with an analog in-memory (IMCORE) block, a digital trainer, a control block (CTRL) for timing and mode selection, and a normal SRAM R/W interface. The system can operate in three modes: as a conventional SRAM, for in-memory inference, and in training mode. IMCORE comprises a conventional 512×256 6T SRAM BCA and in-memory computation circuitry: 1) pulse-width-modulated (PWM) WL drivers to realize functional read (FR), 2) BL processors (BLPs) implementing signed multiplication, 3) a cross BLP (CBLP) implementing summation, and 4) an A2D converter and a comparator bank to generate final decisions. While the IMCORE implements feedforward computations of the SVM algorithm, the trainer implements a batch-mode SGD algorithm (update equations shown in Fig. 31.2.2) to train the SVM weights $\mathbf{W}$ stored in the BCA. The input vectors $\mathbf{X}$ are streamed into the input buffers in the trainer. A gradient estimate ($\Delta$) is accumulated for each input, based on the label $y_n$ and outputs $\delta_{1,n}$ and $\delta_{-1,n}$ of IMCORE. At the end of each batch, the accumulated gradient estimate ($\Delta$) is used to update the weights in the BCA via the SRAM's conventional R/W interface. While 16b weights are used in the trainer during the weight update, only 8b weights are used for feedforward/inference. The learning rate ($\gamma$) and the regularization factor ($\alpha$) can be reconfigured in powers of 2.

During feedforward computations, $\mathbf{W}$ is read in the analog domain on the BLs and the input vectors ($\mathbf{X}$) are transferred to the BLP via a 256b bus. The mixed-signal capacitive multiplier in the BLP realizes multiplication via sequential charge sharing, similar to the one introduced in [3]. Based on the sign of the weights, the multiplier outputs are charge shared either on the positive or on the negative CBLP rails across the BLs. The voltage difference of the negative and positive rails is proportional to the dot product $\mathbf{W}^T \mathbf{X}$. The rail values are either sampled and converted to a digital value by an ADC pair, or a decision is obtained directly via a comparator bank. Three comparators are used, where one generates the decision ($\hat{y}$) while the other two comparators implement an SVM margin detector that triggers a gradient estimate update.
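Since the update equations appear only in Fig. 31.2.2, the batch-mode training loop can be illustrated with the textbook SGD update for a linear SVM (hinge loss with L2 regularization). This is a sketch of that standard form and is not claimed to be the trainer's exact arithmetic: it uses floating point instead of 16b fixed-point weights, and the software margin test `y * score < 1` stands in for the margin-detector outputs of IMCORE. All names are hypothetical.

```python
def sgd_batch_update(w, batch, gamma=2**-4, alpha=2**-6):
    """One batch-mode SGD step for a linear SVM.

    w     : list of weights
    batch : list of (x, y) pairs, x a list of inputs and y a label in {+1, -1}
    gamma : learning rate (a power of 2, as in the text)
    alpha : regularization factor (a power of 2, as in the text)
    """
    n = len(batch)
    delta = [0.0] * len(w)                     # accumulated gradient estimate
    for x, y in batch:
        score = sum(wi * xi for wi, xi in zip(w, x))
        if y * score < 1:                      # sample violates the margin
            for i, xi in enumerate(x):
                delta[i] += y * xi
    # Weights are updated once, at the end of the batch.
    return [(1 - gamma * alpha) * wi + gamma * d / n for wi, d in zip(w, delta)]


w = sgd_batch_update([0.0, 0.0], [([1, 0], 1), ([0, 1], -1)], gamma=0.5, alpha=0.0)
print(w)
```

Accumulating the gradient over a batch and writing the weights once per batch mirrors the flow described above, where the memory is only rewritten at batch boundaries.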

Functional read (Fig. 31.2.3) uses 4-parallel pulse-width and amplitude-modulated (PWAM) WL enable signals, resulting in a BL discharge $\Delta V_{BL}$ (or $\Delta V_{BLB}$) proportional to the 4b $\mathbf{W}_i$'s, stored in a column-major format (Fig. 31.2.3), in one precharge cycle. The BL discharges ($V_{BL}$) proportional to 4b words read in the adjacent BLs are combined in a 1:16 ratio to realize an 8b read out. This enables 128-dimensional 8b vector processing per access. The weights are represented in 2's complement. A comparator detects the sign of $\mathbf{W}_i$, which is then used to select its magnitude, both of which are passed on to the signed multipliers. Spatial variation impacting $\Delta V_{BL}$ is measured across 30 randomly chosen 4-row groups. When the maximum $\Delta V_{BL}$ ($\Delta V_{BL,max}$), corresponding to 4b $\mathbf{W}_i = 15$, is set to 320mV, the maximum variation in $\Delta V_{BL}$ ($(\sigma/\mu)_{max}$), across all 16 values, is found to be 16% vs. 7% at $\Delta V_{BL,max} = 560$mV. This increase in variation leads to an increase in the misclassification rate: from 4% to 18%.
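The way two 4b reads combine into an 8b value can be sketched numerically. This is an idealized, variation-free model under stated assumptions: bit k of a 4b word contributes a $2^k$-weighted share of the discharge, the 1:16 combination is modeled as a normalized weighted average of the two adjacent BL discharges, and only unsigned magnitudes are handled (the 2's-complement sign path is omitted). Function names are hypothetical.

```python
def delta_v_4b(nibble, dv_max_mv=320.0):
    """Ideal BL discharge (mV) for a 4b word; 15 maps to dv_max_mv."""
    if not 0 <= nibble <= 15:
        raise ValueError("nibble must be in 0..15")
    # Each of the 4 rows contributes a binary-weighted share of the discharge.
    total = sum(((nibble >> k) & 1) << k for k in range(4))
    return total * dv_max_mv / 15


def read_8b(value, dv_max_mv=320.0):
    """Combine the MSB-nibble and LSB-nibble discharges in a 16:1 weighting."""
    if not 0 <= value <= 255:
        raise ValueError("value must be in 0..255")
    dv_msb = delta_v_4b(value >> 4, dv_max_mv)
    dv_lsb = delta_v_4b(value & 0xF, dv_max_mv)
    return (16 * dv_msb + dv_lsb) / 17


print(read_8b(255), read_8b(128))
```

In this model the combined voltage is proportional to the 8b value (value × 320/255 mV), so any spread in the individual 4b discharges, as measured above, translates directly into error in the analog 8b read.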

The MIT CBCL face detection data set is used for testing: it consists of 4000 training images and 858 test images. During training, input vectors are randomly sampled, with replacement, from the training set. At the end of each batch, the classifier is tested on the test set to obtain the misclassification (error) rate. Figure 31.2.4 shows the benefits of on-chip learning to overcoming process and data variations, and the need for learning chip-specific weights. Beginning with random initial weights and a  $\Delta V_{BL,max} = 560$ mV, the learning curves converge to within 1% of floating-point accuracy in 400 batch updates for learning rates  $\gamma \geq 2^{-4}$ . The misclassification rate increases dramatically to 18%, when  $\Delta V_{BL,max}$  is reduced to 320mV, at batch number 400 due to the increased impact of process variations during FR. Continued on-chip learning reduces this misclassification rate down to 8% for  $\gamma \geq 2^{-4}$ . Similar results are observed when the illumination changes abruptly at batch number 400 indicating robustness to variation in data statistics. The table in Fig. 31.2.4 shows the misclassification rate measured across 5 chips when the weights are trained on one chip and used in others. The use of chip-specific weights (diagonal) results in an average misclassification rate of 8.4% vs. 43% when they are not, highlighting the need for on-chip learning.

Figure 31.2.5 shows the trade-off between the misclassification rate, IMCORE energy, and  $\Delta V_{BL,max}$ . On-chip training enables the IC to achieve a misclassification rate below 8% at a 38% lower  $\Delta V_{BL,max}$  (320mV) and a lower IMCORE supply  $V_{DD,IMCORE}$  (0.675V), compared to the use of weights obtained at  $\Delta V_{BL,max}$  of 560mV and a  $V_{DD,IMCORE}$  of 0.925V. Thus, the IMCORE energy is reduced by 2.4x without any loss in accuracy. The energy cost of training is dominated by SRAM writes of updated weights, done once per batch. This cost decreases with the batch size ( $N$ ), reaching 26% of the total energy cost for a batch size of 128. At this batch size, 60% of the total energy is due to the control block; this energy overhead reduces with increasing SRAM size.

Figure 31.2.6 shows an IMCORE energy efficiency of 42pJ/decision at a throughput of 32M decisions/s, which corresponds to a computational energy efficiency of 3.12TOPS/W (1OP is a single 8b×8b MAC operation). This work achieves the lowest reported precision-scaled MAC energy, as well as the lowest reported MAC energy when SRAM memory access costs are included. Energy consumption of a digital architecture [1,2] to realize the 128-dimensional SVM algorithm of this work is estimated from their MAC energy, which shows a savings of greater than 7x, thereby demonstrating the suitability of this work for energy-constrained sensory applications.
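The prior-art estimates in the lower part of the comparison table (Fig. 31.2.6) appear to follow from multiplying each reported MAC energy by the 128 MACs needed per decision. The sketch below reproduces that arithmetic; the helper names are hypothetical, and the scaling rule is inferred from the table values rather than stated as a procedure in the text.

```python
N_MACS = 128  # MACs per decision for the 128-dimensional SVM

# MAC energies in pJ, as listed in the comparison table (Fig. 31.2.6)
E_MAC_PJ = {"[1]": 2.98, "[2]": 1.79, "[5]": 0.26, "[6]": 2.0, "[3]": 0.8}

def energy_per_decision_nj(e_mac_pj, n_macs=N_MACS):
    """Estimated energy per decision in nJ for a given MAC energy in pJ."""
    return e_mac_pj * n_macs / 1000.0

def tops_per_watt(e_mac_pj):
    """One MAC per e_mac_pj picojoules equals 1/e_mac_pj tera-ops per joule."""
    return 1.0 / e_mac_pj

estimates = {ref: round(energy_per_decision_nj(e), 3) for ref, e in E_MAC_PJ.items()}
this_work_tops_per_watt = tops_per_watt(0.32)  # 0.32pJ MAC energy from the table
```

The resulting values (0.381, 0.229, 0.033, 0.256, and 0.102nJ) match the "Energy/Decision (nJ)" row of the table, and 0.32pJ per MAC corresponds to 3.125TOPS/W.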

The die micrograph of the 65nm CMOS IC and the performance summary are shown in Fig. 31.2.7.

#### Acknowledgements:

This work was supported in part by Systems On Nanoscale Information fabriCs (SONIC), one of the six SRC STARnet Centers, sponsored by MARCO and DARPA. The authors would like to acknowledge constructive discussions with Professors Pavan Hanumolu, Naveen Verma, Boris Murmann, and David Blaauw.

#### References:

- [1] Y.H. Chen, et al., "Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks," *ISSCC*, pp. 262-263, 2016.
- [2] P.N. Whatmough, et al., "A 28nm SoC with a 1.2GHz 568nJ/prediction sparse deep-neural-network engine with >0.1 timing error rate tolerance for IoT applications," *ISSCC*, pp. 242-243, 2017.
- [3] M. Kang, et al., "A 481pJ/decision 3.4 M decision/s multifunctional deep in-memory inference processor using standard 6T SRAM array," *arXiv:1610.07501*, 2016, <https://arxiv.org/abs/1610.07501>.
- [4] J. Zhang, et al., "In-memory computation of a machine learning classifier in a standard 6T SRAM array," *JSSC*, vol. 52, no. 4, pp. 915-924, April 2017.
- [5] E.H. Lee, et al., "A 2.5GHz 7.7TOPS/W switched-capacitor matrix multiplier with co-designed local memory in 40nm," *ISSCC*, pp. 418-419, 2016.
- [6] S. Joshi, et al., "2pJ/MAC 14b 8×8 linear transform mixed-signal spatial filter in 65nm CMOS with 84dB interference suppression," *ISSCC*, pp. 364-365, 2017.



Figure 31.2.1: An SGD-based on-chip learning system for robust energy efficient always-ON classifiers.



Figure 31.2.2: Proposed SGD-based in-memory classifier architecture.



Figure 31.2.3: In-memory functional read, measured spatial variations on BL swing, and its impact on the measured SVM misclassification rate.



Figure 31.2.5: Measured energy via supply voltage and BL swing scaling. Energy cost of training.



Figure 31.2.4: Measured robustness to spatial variations and non-stationary data.

|                                                                                          | [1]                               | [2]                             | [5]                             | [6]                 | [3]       | [4]       | this work          |
|------------------------------------------------------------------------------------------|-----------------------------------|---------------------------------|---------------------------------|---------------------|-----------|-----------|--------------------|
| Technology                                                                               | 65nm                              | 28nm HPC                        | 40nm                            | 65nm                | 65nm      | 180nm     | 65nm               |
| Algorithm                                                                                | CNN                               | FC-DNN                          | matrix mult.                    | filtering           | SVM       | AdaBoost  | SVM                |
| Data set                                                                                 | ImageNet                          | MNIST                           |                                 |                     | MIT-CBCL  | MNIST     | MIT-CBCL           |
| Architecture                                                                             | digital                           | digital                         | analog                          | analog              | in-memory | in-memory | in-memory          |
| On-chip learning                                                                         | No                                | No                              | No                              | No                  | No        | No        | Yes                |
| Total SRAM size (kb)                                                                     | 1449.2                            | 9248                            | —                               | —                   | 128       | 103.6     | 128                |
| Energy/Decision                                                                          | 7.94mJ <sup>d</sup>               | 0.56μJ                          | —                               | —                   | 0.4nJ     | 0.6nJ     | 0.042nJ            |
| Decisions/s                                                                              | 35                                | 28.8k <sup>d</sup>              | —                               | —                   | 9.2M      | 7.9M      | 32M                |
| # of MACs/Decision                                                                       | 2663M                             | 334k                            | —                               | —                   | 512       | —         | 128                |
| Max. accuracy (%)                                                                        | —                                 | 98                              | —                               | —                   | 96        | 91        | 96                 |
| MAC level metrics                                                                        |                                   |                                 |                                 |                     |           |           |                    |
| MAC precision <sup>a</sup> ( $B_x \times B_w$ )                                          | 16 <sup>b</sup> × 16 <sup>c</sup> | 8 <sup>b</sup> × 8 <sup>c</sup> | 3 <sup>b</sup> × 6 <sup>c</sup> | 8 × 14 <sup>c</sup> | 8 × 8     | 5 × 1     | 8 × 8 <sup>c</sup> |
| Efficiency (TOPS/W)                                                                      | 0.336 <sup>d</sup>                | 0.56 <sup>d</sup>               | 3.84 <sup>b</sup>               | 0.5 <sup>b</sup>    | 1.25      | —         | 3.125              |
| MAC energy ( $E_{MAC}$ ) (pJ)                                                            | 2.98 <sup>d</sup>                 | 1.79 <sup>d</sup>               | 0.26 <sup>b</sup>               | 2 <sup>b</sup>      | 0.8       | —         | 0.32               |
| precision-scaled MAC energy <sup>e</sup> (fJ)                                            | 11.6                              | 28                              | 14.4 <sup>b</sup>               | 17.857 <sup>b</sup> | 12.5      | —         | 4.9                |
| Estimated performance of prior art to realize SVM algorithm with vector dimension of 128 |                                   |                                 |                                 |                     |           |           |                    |
| Energy/Decision (nJ)                                                                     | 0.381                             | 0.229                           | 0.033 <sup>b</sup>              | 0.256 <sup>b</sup>  | 0.102     | —         | 0.042              |
| Decisions/s                                                                              | 250M                              | 75M                             | 19.5M                           | 350k                | 36.8M     | —         | 32M                |
| # MACs per cycle                                                                         | 168                               | 8                               | 1                               | 64                  | 256       | 10,368    | 128                |

<sup>a</sup>s indicates signed. <sup>b</sup>x: input precision; <sup>c</sup>w: weight precision <sup>d</sup>normalized to account for operand precision ( $E_{MAC}/(B_x \times B_w)$ ) <sup>e</sup>estimated from reported data

Figure 31.2.6: Comparison table.



| Parameter                        | Mode     | Value            |
|----------------------------------|----------|------------------|
| Technology                       |          | 65nm CMOS        |
| Die size                         |          | 1.2mm × 1.2mm    |
| Memory capacity                  |          | 16KB (512 × 256) |
| Nominal supply                   |          | 1.0V             |
| CTRL operating frequency         |          | 1GHz             |
| Energy per decision (nJ)         | Test     | 0.21             |
|                                  | Training | 0.34             |
| Average throughput (decisions/s) | Test     | 32.3M            |
|                                  | Training | 21M              |

Figure 31.2.7: Die micrograph and chip summary.

### 31.3 Brain-Inspired Computing Exploiting Carbon Nanotube FETs and Resistive RAM: Hyperdimensional Computing Case Study

Tony F. Wu<sup>1</sup>, Haitong Li<sup>1</sup>, Ping-Chen Huang<sup>2</sup>, Abbas Rahimi<sup>2</sup>,  
Jan M. Rabaey<sup>2</sup>, H.-S. Philip Wong<sup>1</sup>, Max M. Shulaker<sup>3</sup>, Subhasish Mitra<sup>1</sup>

<sup>1</sup>Stanford University, Stanford, CA

<sup>2</sup>University of California, Berkeley, Berkeley, CA

<sup>3</sup>Massachusetts Institute of Technology, Cambridge, MA

We demonstrate an end-to-end brain-inspired hyperdimensional (HD) computing nanosystem, effective for cognitive tasks such as language recognition, using heterogeneous integration of multiple emerging nanotechnologies. It uses monolithic 3D integration of carbon nanotube field-effect transistors (CNFETs, an emerging logic technology with significant energy-delay product (EDP) benefit vs. silicon CMOS [1]) and resistive RAM (RRAM, an emerging memory that promises dense non-volatile and analog storage [2]). Due to their low fabrication temperature (<250°C), CNFETs and RRAM naturally enable monolithic 3D integration with fine-grained and dense vertical connections (exceeding various chip stacking and packaging approaches) between computation and storage layers using back-end-of-line inter-layer vias [3]. We exploit RRAM and CNFETs to create area- and energy-efficient circuits for HD computing: approximate accumulation circuits using gradual RRAM reset operation (in addition to RRAM single-bit storage) and random projection circuits that embrace inherent variations in RRAM and CNFETs. Our results demonstrate: 1. Pairwise classification of 21 European languages with measured accuracy of up to 98% on >20,000 sentences (6.4 million characters) per language pair. 2. One-shot learning (i.e., learning from few examples) using one text sample (~100,000 characters) per language. 3. Resilient operation (98% accuracy) despite 78% hardware errors (circuit outputs stuck at 0 or 1). Our HD nanosystem consists of 1,952 CNFETs integrated with 224 RRAM cells.

For language classification using HD computing, an input sentence (entered serially character by character) is first time-encoded (input characters are mapped to the delay of the rising-edge of the signal *in* with respect to the reference clock, *clk1*, Fig. 31.3.1). The time-encoded sentence is then transformed into a *query hypervector* (QV) using 4 HD operations: 1. *random projection* (each character in a sentence is mapped to a binary hyperdimensional vector (HV), Projection unit, Fig. 31.3.1); 2. *HD permutation* (1-bit rotating shift of HV, HD Permute unit, Fig. 31.3.1); 3. *HD multiplication* (bit-wise XOR of two HVs, HD Multiply unit, Fig. 31.3.1); and 4. *HD addition* (bit-wise accumulation, HD Accumulator unit, Fig. 31.3.1). During training, this QV is chosen to represent a language and stored in RRAM, either in location 0 or location 1 (RRAM-based classifier, Fig. 31.3.1). During inference, the QV is compared with all stored HVs (from training) and the language that corresponds to the least Hamming distance (current on m10 for location 0 and m11 for location 1) is chosen as the output language.
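The four HD operations can be sketched in software as follows. This is an illustration only: it replaces the delay-cell random projection with a seeded pseudo-random generator, and it assumes a bigram encoding (the permuted HV of the previous character XORed with the HV of the current one), which is one common way to chain these operations and is not confirmed by the text. All names are hypothetical.

```python
import random
import string

D = 32  # bits per hypervector (assumed; matches one 32-bit functional unit)
ALPHABET = string.ascii_lowercase + " "  # 26 letters plus the space character

def make_item_memory(seed=0, d=D):
    """Random projection stand-in: one pseudo-random binary HV per character."""
    rng = random.Random(seed)
    return {ch: [rng.randint(0, 1) for _ in range(d)] for ch in ALPHABET}

def permute(hv):
    """HD permutation: 1-bit rotating shift."""
    return hv[-1:] + hv[:-1]

def multiply(a, b):
    """HD multiplication: bit-wise XOR of two HVs."""
    return [x ^ y for x, y in zip(a, b)]

def encode(sentence, item_memory):
    """Build a query hypervector from a sentence (bigram encoding, assumed)."""
    d = len(next(iter(item_memory.values())))
    chars = [c for c in sentence.lower() if c in item_memory]
    counts = [0] * d
    n = 0
    for prev, cur in zip(chars, chars[1:]):
        gram = multiply(permute(item_memory[prev]), item_memory[cur])
        counts = [c + g for c, g in zip(counts, gram)]  # HD addition
        n += 1
    # Threshold at half the number of accumulated HVs to return to binary.
    return [1 if c > n / 2 else 0 for c in counts]

item_memory = make_item_memory()
qv = encode("hello world", item_memory)
```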

Figure 31.3.2 shows the overall nanosystem. Our design (exploiting CNFETs, RRAM, and monolithic 3D) provides significant EDP and area benefits vs. traditional 2D silicon CMOS-based digital design (e.g., projected 35x EDP and 3x area benefits for pairwise language classification using HD computing when compared at the 28nm technology node, estimated using simulations after place-and-route). A key requirement for HD computing is to achieve random projections of inputs to HVs [4]. To do so, we embrace the following inherent variations in nanotechnologies using a delay cell (PMOS-only CNFET inverter with an additional RRAM in the pull-down network, Fig. 31.3.2 and Fig. 31.3.3): variations in RRAM resistance and variations in CNFET drive current resulting from variations in carbon nanotube (CNT) count (i.e., the number of CNTs in a CNFET) or threshold voltage.

To generate HVs, each possible input (26 letters of the alphabet and the space character [4]) is time-encoded and mapped to an evenly-spaced delay (maximum delay *T*, Fig. 31.3.1). To calculate each bit of the HV, random delays are added to *clk1* and input (*in*) using delay cells. If the resulting signals are coincident (the falling edges are close enough to set the SR latch) the output is '1' (Fig. 31.3.3). To initialize delay cells, the RRAM resistance is first reset to a high-resistance state (HRS) and then set to a low-resistance state (LRS). This process is performed before training. During random projection operation, the RRAM in each delay cell is static (the voltage across the RRAM (0-1V) is not enough to change its resistance).

HD computing generally requires HVs representing these 27 possible inputs to be nearly orthogonal to each other (Hamming distance close to 50% of the vector dimension). This requirement (e.g., 16 for 32 bits) is fulfilled as delay variation ( $\sigma/\mu$ ) of delay cells increases (Fig. 31.3.4,  $\sigma$ : standard deviation of delays,  $\mu$ : mean delay). By combining the RRAM resistance variations and CNFET drive current variations, our delay cells achieve experimentally characterized delay variations with  $\sigma/\mu = 1.5$ , corresponding to a mean Hamming distance of 16 for 32 bits (Fig. 31.3.4), sufficient for random projection in HD computing. Other approaches (e.g., operating in subthreshold voltages to exploit inherent variations [5] or pseudo-random number generators such as linear feedback shift registers and their variants) can also be used to generate HVs.
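The near-orthogonality target can be checked numerically for ideal random bits. The sketch below does not model the delay cells; it assumes each HV bit is an independent fair coin flip from a seeded software generator, which is the limiting case the delay variation is meant to approach. Function names are hypothetical.

```python
import itertools
import random

def hamming(a, b):
    """Number of bit positions in which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def mean_pairwise_hamming(n_vectors=27, d=32, seed=1):
    """Mean Hamming distance over all pairs of ideal random binary HVs."""
    rng = random.Random(seed)
    hvs = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_vectors)]
    pairs = list(itertools.combinations(hvs, 2))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)

mean_distance = mean_pairwise_hamming()  # expected to be close to d/2 = 16
```

For 27 HVs of 32 bits there are 351 pairs, and the mean distance stays close to 16, the value quoted for the measured delay cells.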

Bit-wise accumulation corresponds to counting the number of ones in each bit location of the HV. For example, bit-wise accumulation of binary vectors 0100, 0101, and 1011 produces (1,2,1,2). Thresholding is performed to transform the vector back into a binary vector [4] (e.g., (1,2,1,2) turns into 0101). The threshold value is set to half of the total number of HVs accumulated (e.g., a threshold of 50 for a 100-character sentence) [4]. For language classification, the average input sentence contains fewer than 128 characters (requiring 7 bits of precision to accumulate 128 HVs). Here, we use an approximate accumulator with thresholding: we leverage the multiple RRAM resistance values that can be programmed by a gradual reset ( $V_{top} - V_{bot}$ : -2.6V) to count the number of ones. Starting from the low-resistance state, multiple reset pulses (50μs pulse width, 1ms period) are applied to gradually increase the resistance toward the high-resistance state (Fig. 31.3.3) [2]. Since the accumulated value is thresholded to a binary value, the impact of accumulation error is somewhat mitigated (with mean cycle-to-cycle error 4%, showing consistency across time) (Fig. 31.3.4). A digital buffer is used to transform (threshold) the sum to a binary vector (when clk is 0V and  $V_{top} - V_{bot}$  is +0.5V). To clear the sum, a set operation is performed on the RRAM ( $V_{top} - V_{bot}$ : +2.6V). Each such approximate accumulator uses 8 transistors and a single RRAM cell. In contrast, a digital 7-bit accumulator may use 240 transistors. Thus, when D accumulators are needed (where D is the HV dimension, e.g., 10,000), the savings can be significant.
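The worked example in this paragraph can be reproduced exactly in a few lines. The sketch below models ideal (error-free) accumulation rather than the gradual-reset RRAM behavior; the function names are hypothetical, and treating a count equal to the threshold as 0 is an assumption, since the text does not specify tie handling.

```python
def accumulate(hvs):
    """Count the ones in each bit position across a list of bit strings."""
    return tuple(sum(int(hv[i]) for hv in hvs) for i in range(len(hvs[0])))

def threshold(counts, n_accumulated):
    """Map counts back to bits: 1 if the count exceeds half the HVs accumulated."""
    return "".join("1" if c > n_accumulated / 2 else "0" for c in counts)

hvs = ["0100", "0101", "1011"]
counts = accumulate(hvs)              # (1, 2, 1, 2)
binary = threshold(counts, len(hvs))  # "0101"

# Transistor budget for D accumulators, using the per-accumulator counts in the text.
D = 10_000
approx_transistors = 8 * D     # plus one RRAM cell per accumulator
digital_transistors = 240 * D
savings_ratio = digital_transistors // approx_transistors
```

With D = 10,000 this gives 80,000 transistors (plus 10,000 RRAM cells) for the approximate accumulators against 2,400,000 for 7-bit digital accumulators, a 30x reduction in transistor count.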

We create multiple functional units of the HD encoder (Fig. 31.3.5). These functional units, when connected in parallel (Fig. 31.3.2), form the HD encoder.

The classifier is implemented using 2T2R (2-CNFET transistor, 2-RRAM) ternary content-addressable memory (TCAM) cells to form an associative memory [6]. During training, the matchline (ML) (i.e. m10 or m11) corresponding to the language (e.g., m10 for English and m11 for Spanish) is set to 3V, writing the QV into the RRAM cells connected to the ML (Fig. 31.3.2). During inference, the MLs (i.e. m10 and m11) are set to a low voltage (e.g. 0.5V), and the current on each ML is read as an output. When the QV bit is equal to the value stored in a TCAM cell (match), the current is high. Otherwise (mismatch), the current is low. An individual TCAM cell has a match / mismatch current ratio of ~20 (Fig. 31.3.6). Cell currents are summed on each ML. The line with the most current corresponds to the output class (read and compared off-chip). Figure 31.3.6 shows a distribution of ML currents that correspond to Hamming distance. Our HD computing nanosystem is resilient to errors: despite 78% of QV bits (25 out of 32) stuck at 0 or 1 (characterized by measuring functional units in Fig. 31.3.5), our overall nanosystem still achieves 98% classification accuracy.
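The matchline read-out amounts to a nearest-neighbor search in Hamming distance. The sketch below models it with unit currents: each matching cell contributes a normalized current of 1 and each mismatching cell contributes 1/20, following the match/mismatch ratio of ~20 quoted above. The unit current, the example vectors, and the function names are illustrative assumptions, and the comparison that the text performs off-chip is done here in software.

```python
MATCH_MISMATCH_RATIO = 20.0  # approximate per-cell current ratio from the text

def ml_current(query, stored, ratio=MATCH_MISMATCH_RATIO):
    """Normalized matchline current: matching cells add 1, mismatching cells 1/ratio."""
    matches = sum(q == s for q, s in zip(query, stored))
    mismatches = len(query) - matches
    return matches + mismatches / ratio

def classify(query, memory):
    """Return the label whose matchline carries the most current."""
    return max(memory, key=lambda label: ml_current(query, memory[label]))

# Two stored class HVs (shortened to 8 bits for readability) and one query.
memory = {
    "location 0": [0, 1, 0, 1, 1, 0, 0, 1],
    "location 1": [1, 1, 0, 0, 1, 0, 1, 1],
}
query = [0, 1, 0, 1, 1, 0, 1, 1]
winner = classify(query, memory)
```

Here the query differs from the first stored HV in one bit and from the second in two, so the first matchline carries more current and is selected.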

We emulate larger HD computing systems (more bits per HV) by running our fabricated nanosystem iteratively. In each iteration, language pairs are trained (each language is trained on one text sample of 100,000 characters, which uses the same amount of time per character as an inference: one-shot learning, Fig. 31.3.6) and all inferences are performed. After each iteration, the RRAMs in the delay cells are cycled (i.e., reset to HRS then set to LRS, providing a new projection of HVs for each iteration). After all iterations, the ML currents of the corresponding sentences are summed and compared. For a single iteration, the accuracy is 59%. Using 256 iterations with 22% non-stuck QV bits (D=8192, with 1792 non-stuck QV bits), our HD nanosystem can categorize between 2 European languages with a mean accuracy of 98% (Fig. 31.3.6). The classification energy per iteration is measured at 540μJ (3V supply, average power 5.4mW, 1kHz clock frequency, 1μm gate length). The reported accuracy is the percentage of 84,000 sentences that were classified correctly (dataset: 420 language pairs, 200 sentences per language pair). Software HD implementations (using 8192-bit HVs with HV sparsity 0.5) on general-purpose processors achieve 99.2% accuracy. This work illustrates how properties of heterogeneous emerging nanotechnologies can be effectively exploited and combined to realize brain-inspired computing architectures that: tightly integrate computing and storage, provide energy-efficient computation, employ approximation, embrace randomness, and exhibit resilience to errors.
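The figures quoted in this paragraph are mutually consistent, as the arithmetic below shows, and the iterative emulation reduces to summing matchline currents before comparing. This is a sketch with hypothetical names; the per-iteration currents in the example are made-up illustrative numbers, not measurements.

```python
BITS_PER_ITERATION = 32        # QV bits produced by one pass of the nanosystem
STUCK_BITS_PER_ITERATION = 25  # bits stuck at 0 or 1 (25 out of 32, from the text)
ITERATIONS = 256

effective_dimension = ITERATIONS * BITS_PER_ITERATION
non_stuck_bits = ITERATIONS * (BITS_PER_ITERATION - STUCK_BITS_PER_ITERATION)
non_stuck_fraction = non_stuck_bits / effective_dimension

total_sentences = 420 * 200        # language pairs x sentences per pair
ordered_language_pairs = 21 * 20   # consistent with 420 pairs of 21 languages
iteration_time_s = 540e-6 / 5.4e-3 # energy / average power

def classify_over_iterations(currents_per_iteration):
    """Sum (location 0, location 1) ML currents over iterations; return the winner."""
    total_0 = sum(i0 for i0, _ in currents_per_iteration)
    total_1 = sum(i1 for _, i1 in currents_per_iteration)
    return 0 if total_0 >= total_1 else 1

# Illustrative currents in arbitrary units for three iterations.
decision = classify_over_iterations([(1.0, 1.2), (1.5, 1.1), (1.3, 1.0)])
```

The arithmetic gives D = 8192 with 1792 non-stuck bits (about 22%), 84,000 sentences in total, and about 0.1s per iteration at the quoted energy and average power.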

#### Acknowledgements

Work supported in part by DARPA, NSF/NRI/GRC E2CDA, STARnet SONIC, and Stanford SystemX Alliance. We thank Edith Beigne of CEA-LETI for fruitful discussions.

#### References

- [1] L. Chang, et al., "Technology Optimization for High Energy-Efficiency Computation," Short course, *IEDM*, 2012.
- [2] H.-S. P. Wong, et al., "Metal-oxide RRAM," *Proc. of IEEE*, vol. 100, no. 6, pp. 1951-1970, May 2012.
- [3] M. M. Shulaker, et al., "Three-dimensional integration of nanotechnologies for computing and data storage on a single chip," *Nature*, vol. 547, pp. 74-78, July 2017.
- [4] A. Rahimi, et al., "Robust and Energy-Efficient Classifier Using Brain-Inspired Hyperdimensional Computing," *Int. Symp. on Low Power Elec. & Design*, pp. 64-69, 2016.
- [5] S. Hanson, et al., "Energy Optimality and Variability in Subthreshold Design," *Int. Symp. on Low Power Elec. & Design*, pp. 363-365, 2006.
- [6] L. Zheng, et al., "RRAM-based TCAMs for pattern search," *IEEE ISCAS*, pp. 1382-1385, 2016.



Figure 31.3.1: HD computing architecture and time-encoding of input sentences.



Figure 31.3.2: Schematic of HD computing nanosystem built using monolithic 3D integration of CNFETs and RRAM.



Figure 31.3.3: Implementation of a delay cell and approximate accumulator using CNFETs and RRAM.



Figure 31.3.4: Characterization of the delay distribution of the delay cells.



Figure 31.3.5: Input and output waveforms of a functional unit of the HD encoder with outputs of 32 approximate accumulators overlaid.



Figure 31.3.6: Pairwise classification of 21 European languages.



Figure 31.3.7: Die micrograph.

### 31.4 A 65nm 1Mb Nonvolatile Computing-in-Memory ReRAM Macro with Sub-16ns Multiply-and-Accumulate for Binary DNN AI Edge Processors

Wei-Hao Chen, Kai-Xiang Li, Wei-Yu Lin, Kuo-Hsiang Hsu, Pin-Yi Li, Cheng-Han Yang, Cheng-Xin Xue, En-Yu Yang, Yen-Kai Chen, Yun-Sheng Chang, Tzu-Hsiang Hsu, Ya-Chin King, Chorng-Jung Lin, Ren-Shuo Liu, Chih-Cheng Hsieh, Kea-Tiong Tang, Meng-Fan Chang

National Tsing Hua University, Hsinchu, Taiwan

Many artificial intelligence (AI) edge devices use nonvolatile memory (NVM) to store the weights for the neural network (trained off-line on an AI server), and require low-energy and fast I/O accesses. The deep neural networks (DNN) used by AI processors [1,2] commonly require p-layers of a convolutional neural network (CNN) and q-layers of a fully-connected network (FCN). Current DNN processors that use a conventional (von-Neumann) memory structure are limited by high access latencies, I/O energy consumption, and hardware costs. Large working data sets result in heavy accesses across the memory hierarchy; moreover, large amounts of intermediate data are generated by the large number of multiply-and-accumulate (MAC) operations for both CNN and FCN. Even when binary-based DNNs [3] are used, the required CNN and FCN operations result in a major memory I/O bottleneck for AI edge devices.

A compute-in-memory (CIM) approach, especially when paired with high-density NVM (nvCIM), can enhance DNN operations for AI edge processors [4,5] by: (1) avoiding latencies due to data access from multi-layer memory hierarchies (NVM-DRAM-SRAM) by storing most, or all, of the weights in the nvCIM; (2) reducing intermediate data access; (3) shortening the latency of multiple MAC operations to one CIM-cycle. As will be discussed later, many challenges and restrictions in circuit design and NVM devices hinder the realization of nvCIM. Thus far, only one 32x32 CIM ReRAM macro, with a high resistance-ratio (R-ratio), has been demonstrated [6]. Moreover, the 1T1R cell-read currents ($I_{HRS}$ and $I_{LRS}$), for both the high- (HRS, $R_{HRS}$) and low-resistance states (LRS, $R_{LRS}$), have the same polarity. This prevents nvCIM from implementing both positive and negative weights on the same BL, as is required for software-driven DNN structures, such as XNOR nets. To achieve a smaller energy-hardware cost, this work proposes a hardware-driven binary-input ternary-weighted (BITW) network using our pseudo-binary nvCIM macros and a two-macro (nvCIM-P and nvCIM-N) DNN structure [5]: the nvCIM-P macro stores positive weights while the nvCIM-N macro stores negative weights. The BITW network combines ternary weights [4] (+1, 0 and -1) and a modified binary input (1 and 0) with an in-house computing flow for subtraction (S), activation (A) and max-pooling (MP). This work proposes various circuit-level techniques for the design of a 65nm 1Mb pseudo-binary nvCIM ReRAM macro capable of storing 512k weights, and performing up to 8k MAC operations within one CIM cycle. We demonstrate the first megabit nvCIM ReRAM macro for CNN/FCN, and the fastest (<16ns) CIM operation for NVMs.
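The two-macro arrangement can be illustrated with a small functional sketch: ternary weights are split into a positive and a negative 0/1 pattern, each pattern yields a non-negative count, and the subtraction step recovers the signed result. The function names are hypothetical and the model is purely arithmetic; it does not represent the circuit-level computing flow.

```python
def split_ternary_weights(weights):
    """Split ternary weights (+1, 0, -1) into two 0/1 vectors.

    The first vector marks +1 weights (as stored in nvCIM-P), the second
    marks -1 weights (as stored in nvCIM-N).
    """
    for w in weights:
        if w not in (-1, 0, 1):
            raise ValueError(f"weight {w!r} is not ternary")
    positive = [1 if w == 1 else 0 for w in weights]
    negative = [1 if w == -1 else 0 for w in weights]
    return positive, negative


def macv(inputs, binary_weights):
    """Count the positions where a binary input 1 meets a stored 1."""
    return sum(x * w for x, w in zip(inputs, binary_weights))


def bitw_mac(inputs, weights):
    """Signed MAC result obtained as MACV(P) - MACV(N)."""
    positive, negative = split_ternary_weights(weights)
    return macv(inputs, positive) - macv(inputs, negative)


inputs = [1, 0, 1, 1, 1, 0, 1, 0, 1]       # binary inputs for a 3x3 kernel
weights = [1, -1, 0, -1, 1, 1, -1, 0, 1]   # ternary weights
print(bitw_mac(inputs, weights))
```

Both macros only ever accumulate same-polarity contributions, so the sign is introduced solely by the subtraction, which is how the structure sidesteps the same-polarity limitation of the 1T1R read currents.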

Figure 31.4.2 shows the structure of our BITW nvCIM 1T1R ReRAM macro, comprising a dual-mode WL driver (D-WLDR), a distance-racing current-mode sense amplifier (DR-CSA), a reference current ($I_{REF}$) generator (REFG), and 1T1R ReRAM cell arrays. All binary weights (W) of each $n \times n$ kernel in a CNN layer or $m$ weights in an FCN layer are stored in memory cells on the same BL. A 0-weight is encoded as HRS, while a +1 weight is encoded as LRS in nvCIM-P, and a -1 weight as LRS in nvCIM-N. Digital inputs are applied via the WLs. When WL=0, the cell current ($I_c$) is zero. When WL=1, $I_c$ is either $I_{HRS}$ (for a 0-weight) or $I_{LRS}$ (for a +1 or -1 weight). If the cell's R-ratio is large ($I_{HRS} \ll I_{LRS}$), then $I_c$ represents the product of input $\times$ weight. Using a current-mode read scheme, turning on between 0 and $n^2$ (or $m$) WLs simultaneously results in an $I_{BL}$ equal to the sum of the cell currents ($I_c$) of all activated cells. MAC values (MACV), which refer to the number of $IN \times W = \pm 1$ products, are differentiated using a multi-level (ML) CSA that resolves $I_{BL}$ and outputs a $j$-bit digital code. The value of $j$ is determined by the algorithm in the NN model structure. The proposed nvCIM with $v$ columns and $s$ IOs can perform up to $s \times n^2$ (or $s \times m$) MAC operations within one CIM cycle, where $s \leq v$.

There are three major challenges in using a nvCIM ReRAM macro: (1) a large input offset ($I_{OS}$) for the CSA when $I_{BL}$ is large; (2) a reduced sensing margin (SM) when using a conventional mid-point $I_{REF}$ read scheme; (3) a small sensing margin across various computing (MACV) states due to input patterns and process variation, particularly when the R-ratio is small (significant $I_{HRS}$). We propose a DR-CSA to suppress $I_{OS}$ at high $I_{BL}$ while efficiently utilizing the signal margin in order to overcome (1) and (2). Furthermore, we propose an input-aware dynamic $I_{REF}$ generation scheme (IA-REF) to combat (3).

Figure 31.4.3 presents the DR-CSA, which compares the distance between $I_{BL}$ and two reference currents ($I_{REF\_H}$, $I_{REF\_L}$) using the top (PT1, PT2, PT3) and bottom (PB1, PB2, PB3) current paths. In phase-1, P0-P2 precharge the BL and reference-BLs (RBL). S1 is on to store the gate-source voltages of P0 ($V_{GS\_P0}$) and P1 ($V_{GS\_P1}$) on C0/C1 to sample $I_{BL}$/$I_{REF\_H}$ (at PT2/PT3) as in [7]. In phase-2, PRE/S1 are off and S2/S3 are on. Path PT2 ($I_{BL}$) is connected to PB1 ($I_{REF\_L}$), while PT3 ($I_{REF\_H}$) is connected to PB2 ($I_{BL}$). For a given period $T_{P2}$, the voltage at node Q ($V_Q$) is $T_{P2}(I_{BL}-I_{REF\_L})$, while the voltage at node QB ($V_{QB}$) is $T_{P2}(I_{REF\_H}-I_{BL})$. The difference between $V_Q$ and $V_{QB}$ ($\Delta V_{Q-QB} = |V_Q - V_{QB}|$) is $T_{P2} \cdot |2I_{BL} - (I_{REF\_H} + I_{REF\_L})|$. In phase-3, S2 is off, and the cross-coupled pair N3/N4 amplifies $\Delta V_{Q-QB}$ as the phase-3 period ($T_{P3}$) increases. Finally, in phase-4, SAEN is on to generate a digital output (SAO) based on $\Delta V_{Q-QB}$. In short, the sensing margin of DR-CSA is $2I_{BL} - (I_{REF\_H} + I_{REF\_L})$, which is approximately 2$\times$ larger than that of a conventional mid-point CSA ($I_{BL} - (I_H + I_L)/2$). Repeating this sensing procedure $j$ times generates a $j$-bit output by DR-CSA. Many BLs share one CSA; therefore, the area overhead of DR-CSA is less than 1.54% for a 1Mb macro.
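The factor of two follows directly from the two node voltages defined for phase-2. As a sketch, taking $V_Q$ and $V_{QB}$ as given in the text (with the node capacitance absorbed into the proportionality) and assuming the conventional mid-point reference is the mean of the two reference currents:

$$
\begin{aligned}
V_Q - V_{QB} &= T_{P2}\,(I_{BL} - I_{REF\_L}) - T_{P2}\,(I_{REF\_H} - I_{BL}) \\
             &= T_{P2}\,\bigl(2I_{BL} - (I_{REF\_H} + I_{REF\_L})\bigr) \\
             &= 2\,T_{P2}\left(I_{BL} - \frac{I_{REF\_H} + I_{REF\_L}}{2}\right)
\end{aligned}
$$

The bracket in the last line is the margin a mid-point scheme would see, so racing $I_{BL}$ against both references doubles the developed signal for the same $T_{P2}$. The sign of $V_Q - V_{QB}$ indicates on which side of the mid-point $I_{BL}$ lies.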

Figure 31.4.4 presents the IA-REF scheme, comprising a reference-WL controller (RWLC), an input-counter (ICN), and input-aware replica rows (IA-RR). A limited R-ratio and significant $I_{HRS}$ lead to overlap in $I_{BL}$ across neighboring MACV values, resulting in read failures. For example, the $I_{BL}$ for MACV=1 (one LRS cell) may be due to one WL on (one LRS cell, 1L0H) or nine WLs on (one LRS and 8 HRS cells, 1L8H). The $I_{BL\_1L8H}$ is larger than the minimum $I_{BL}$ for MACV=2 (two LRS cells accessed by two WLs, 2L0H). Fortunately, for MACVs with the same number ($N_{WL}$) of WLs on, there is spacing in $I_{BL}$ between neighboring MACVs.
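The overlap can be made concrete with a small numerical sketch. The cell currents below are hypothetical values (an R-ratio of 5) chosen only to illustrate the effect; they are not measured data, and the helper name is an assumption.

```python
def bitline_current(n_lrs, n_hrs, i_lrs, i_hrs):
    """Ideal BL current for n_lrs activated LRS cells and n_hrs activated HRS cells."""
    return n_lrs * i_lrs + n_hrs * i_hrs


# Hypothetical cell currents in microamps (R-ratio = 5); not from the paper.
I_LRS = 10.0
I_HRS = 2.0

i_1l8h = bitline_current(1, 8, I_LRS, I_HRS)   # MACV=1 with nine WLs on
i_2l0h = bitline_current(2, 0, I_LRS, I_HRS)   # MACV=2 with two WLs on
i_2l7h = bitline_current(2, 7, I_LRS, I_HRS)   # MACV=2 with nine WLs on

# Across different N_WL, neighboring MACVs overlap: MACV=1 can exceed MACV=2.
print(i_1l8h > i_2l0h)

# At a fixed N_WL, neighboring MACVs stay separated by I_LRS - I_HRS.
print(i_2l7h - i_1l8h)
```

In this ideal model the overlap appears whenever $8 I_{HRS} > I_{LRS}$, whereas the spacing at fixed $N_{WL}$ is always $I_{LRS} - I_{HRS}$, which is why making the reference depend on $N_{WL}$ recovers a usable margin.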

The top and bottom sub-arrays in this nvCIM macro operate as a pair. When accessing the top array, selected BLs in the bottom array serve as reference BLs (RBL). For a nvCIM with $j$-bit ML outputs, $2^j$ BLs per IO are used as RBLs. The input-counter counts the number of inputs equaling 1 in the input pattern, and outputs $N_{WL}$ to RWLC. Then, RWLC turns on $N_{WL}$ replica WLs (RWLs), from RWL[1] to RWL[$N_{WL}$]. IA-RR includes $k$ rows of RWLs with designed patterns on their replica memory cells (RMC) to generate various $I_{REF}$ across RBLs for DR-CSA. An RBL[r] has $r$ LRS cells, where the range of $r$ is $[0, 2^j-1]$. Each ML sensing cycle activates only two RBLs for the required $I_{REF\_H}$ and $I_{REF\_L}$.

For example, there are 9 RWLs and 8 RBLs in an IA-RR for a CNN operation with 3$\times$3 kernels (max. 9 inputs) using a 3-bit ML DR-CSA ($j=3$). If an ML sensing cycle requires $I_{REF\_L}=3L5H$ and $I_{REF\_H}=4L4H$ at $N_{WL}=8$, then RWL[1]-RWL[8] are on and RBL[3] and RBL[4] are activated. RBL[3] has 3 LRS cells (RMC[3:1,3]) and 5 HRS cells (RMC[8:4,3]) activated to generate $I_{BL3}=3I_{LRS}+5I_{HRS}$. RBL[4] has 4 LRS cells (RMC[4:1,4]) and 4 HRS cells (RMC[8:5,4]) activated to generate $I_{BL4}=4I_{LRS}+4I_{HRS}$. Figure 31.4.5 shows that, with a larger sensing margin and the current-distance scheme, the proposed DR-CSA enables a 5-6$\times$ reduction in input offset compared to a conventional CSA when $I_{BL}$ exceeds 40μA. With the same sensing yield across MACVs, DR-CSA can tolerate a minimum R-ratio that is 4$\times$ smaller than a conventional CSA. The IA-REF scheme increases the worst-case signal margin of MACVs from -27.9μA to 7.8μA. DR-CSA and IA-REF enhance the accuracy of text recognition/inference operations (i.e. MNIST database) by 50$\times$ for binary DNN, compared to a conventional CIM (not using DR-CSA and IA-REF).
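The replica-row pattern in this example can be sketched as a small table generator. It assumes, following the RMC indices quoted above, that RBL[r] holds LRS cells in rows 1 to r and HRS cells in the remaining rows, so that turning on RWL[1] to RWL[$N_{WL}$] activates min(r, $N_{WL}$) LRS cells. The function name and the current values are hypothetical.

```python
def replica_reference_currents(n_wl, j_bits, i_lrs, i_hrs):
    """Ideal reference current of each RBL[r] when RWL[1..n_wl] are on.

    Assumes RBL[r] stores LRS in replica rows 1..r and HRS elsewhere.
    Returns a dict mapping r (0 .. 2**j_bits - 1) to the reference current.
    """
    refs = {}
    for r in range(2 ** j_bits):
        n_lrs = min(r, n_wl)      # LRS cells on the activated rows
        n_hrs = n_wl - n_lrs      # remaining activated cells are HRS
        refs[r] = n_lrs * i_lrs + n_hrs * i_hrs
    return refs


# Hypothetical cell currents in microamps; not from the paper.
refs = replica_reference_currents(n_wl=8, j_bits=3, i_lrs=10.0, i_hrs=2.0)
i_ref_l = refs[3]   # 3L5H
i_ref_h = refs[4]   # 4L4H
print(len(refs), i_ref_l, i_ref_h)
```

Because the references are built from the same number of activated cells as the accessed BL, the HRS contribution tracks $N_{WL}$ on both sides of the comparison, leaving the LRS count as the quantity being resolved.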

Figure 31.4.6 presents the measured results of a 1Mb nvCIM ReRAM macro fabricated using 1T1R contact-ReRAM cells in a 65nm CMOS logic process. For CNN operations using 3$\times$3 kernels, the measured CIM access time ($T_{AC-CIM}$) excluding the path-delay ($T_{PATH}$) is 14.8ns. In FCN mode, the shmoo plot confirmed a $T_{AC-CIM}=15.6$ns across various input patterns with 25 inputs per operation. This work increased the computing speed by 2$\times$ and capacity by 1000$\times$, compared to the previous ReRAM nvCIM [6]. Figure 31.4.7 presents a die photo and chip summary.

#### Acknowledgements:

The authors would like to thank NVM-DTP of TSMC, TSMC-JDP and MOST-Taiwan for their support.

#### References:

- [1] M. Price, et al., "A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models and Voice-Activated Power Gating," *ISSCC*, pp. 244-245, 2017.
- [2] D. Shin, et al., "DNPU: An 8.1TOPS/W reconfigurable CNN-RNN processor for general-purpose deep neural networks," *ISSCC*, pp. 240-241, 2017.
- [3] I. Hubara, et al., "Binarized Neural Networks," *Proc. NIPS*, pp. 4107-4115, 2016.
- [4] Z. Lin, et al., "Neural networks with few multiplications," *arXiv:1510.03009*, 2016, <https://arxiv.org/abs/1510.03009>.
- [5] P. Chi, et al., "PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-based Main Memory," *Int. Symp. on Comp. Arch.*, pp. 27-39, 2016.
- [6] F. Su, et al., "A 462GOPs/J RRAM-Based Nonvolatile Intelligent Processor for Energy Harvesting IoE System Featuring Nonvolatile Logics and Processing-In-Memory," *IEEE Symp. VLSI Circuits*, 2017.
- [7] M.-F. Chang, et al., "An offset tolerant current-sampling-based sense amplifier for sub-100nA-cell-current nonvolatile memory," *ISSCC*, pp. 206-207, 2011.



Figure 31.4.1: Nonvolatile computation in memory (nvCIM) concept, with BITW-NN in AI edge processors.



Figure 31.4.2: Proposed macro structure of nvCIM and its challenges.



Figure 31.4.3: Structure and operation of distance-racing current-sense amplifier.



Figure 31.4.4: Proposed input-aware dynamic  $I_{REF}$  generation scheme.



Figure 31.4.5: Performance of proposed schemes.



Figure 31.4.6: Measured results.



| Parameter | Value |
|---|---|
| Technology | 65nm CMOS logic process |
| ReRAM | Logic-process compatible CRRAM (unipolar) |
| Cell size (1T1R) | 0.25 $\mu\text{m}^2$ |
| RRAM mode | Memory / CIM |
| CIM mode | CNN / FCN |
| Capacity | 1Mb (8 × 128Kb) |
| Sub-array | 512 rows × 256 columns |
| Read delay (BL length = 512, VDD = 1V) | Memory mode: 5.0ns (1-bit out)<br>CNN mode: 14.8ns (3-bit ML out)<br>FCN mode: 15.6ns (3-bit ML out) |

Figure 31.4.7: Die photo and summary table.

### 31.5 A 65nm 4Kb Algorithm-Dependent Computing-in-Memory SRAM Unit-Macro with 2.3ns and 55.8TOPS/W Fully Parallel Product-Sum Operation for Binary DNN Edge Processors

Win-San Khwa<sup>1,2</sup>, Jia-Jing Chen<sup>1</sup>, Jia-Fang Li<sup>1</sup>, Xin Si<sup>3</sup>, En-Yu Yang<sup>1</sup>, Xiaoyu Sun<sup>4</sup>, Rui Liu<sup>4</sup>, Pai-Yu Chen<sup>4</sup>, Qiang Li<sup>3</sup>, Shimeng Yu<sup>4</sup>, Meng-Fan Chang<sup>1</sup>

<sup>1</sup>National Tsing Hua University, Hsinchu, Taiwan; <sup>2</sup>TSMC, Hsinchu, Taiwan

<sup>3</sup>University of Electronic Science and Technology of China, Sichuan, China

<sup>4</sup>Arizona State University, Tempe, AZ

For deep-neural-network (DNN) processors [1-4], the product-sum (PS) operation dominates the computational workload for both convolution (CNVL) and fully-connected (FCNL) neural-network (NN) layers. This hinders the adoption of DNN processors in edge artificial-intelligence (AI) devices, which require low-power, low-cost and fast inference. Binary DNNs [5-6] are used to reduce computation and hardware costs for AI edge devices; however, a memory bottleneck still remains. In Fig. 31.5.1, conventional PE arrays exploit parallelized computation, but suffer from inefficient single-row SRAM access to weights and intermediate data. Computing-in-memory (CIM) improves efficiency by enabling parallel computing, reducing memory accesses, and suppressing intermediate data. Nonetheless, three critical challenges remain (Fig. 31.5.2), particularly for FCNL. We overcome these problems by co-optimizing the circuits and the system. Recent research has focused on XNOR-based binary-DNN structures [6]. Although they achieve slightly higher accuracy than other binary structures, they incur a significant hardware cost (i.e. 8T-12T SRAM) to implement a CIM system. To further reduce the hardware cost by using 6T SRAM to implement a CIM system, we employ the binary DNN with 0/1-neuron and $\pm 1$-weight proposed in [7]. We implemented a 65nm 4Kb algorithm-dependent CIM-SRAM unit-macro and in-house binary DNN structure (focusing on FCNL with a simplified PE array) for cost-aware DNN AI edge processors. This resulted in the first binary-based CIM-SRAM macro with the fastest (2.3ns) PS operation, and the highest energy-efficiency (55.8TOPS/W) among reported CIM macros [3-4].

Figure 31.5.2 presents the CIM-SRAM unit macro. In inference operations, input data (IN) is converted into multiple WL activations. The weights (W) of each $n \times n$ CNVL kernel or m-weight FCNL are stored in $n^2$ or $m$ consecutive cells on the same BL. When WL=IN=1, the read current ($I_{MC}$) of each activated memory-cell (MC) represents its input-weight-product (IN×W); the resulting BL voltage ($V_{BL}$) is the sum of IN×W. Unlike the BL-discharge approach in [3] and typical SRAM, we adopted a voltage-divider (VD) approach to read PS results. The first challenge is excessive BL current. For a typical 6T CIM-SRAM, the charge ($I_{MC-C}$) and discharge ($I_{MC-D}$) cell currents both develop on BL/BLB, since both pass-gates (PGL/PGR) are activated. A large number of WL activations ($N_{WL}$) results in a high BL current ($I_{BL} = n^2(I_{MC-C} + I_{MC-D})$ or $m(I_{MC-C} + I_{MC-D})$). Since $m$ (e.g. 64) is usually much larger than $n^2$ (e.g. 9 for a 3×3 kernel), the CIM-SRAM for FCNL faces more difficult circuit-design challenges than that for CNVL. Thus, this work focuses on CIM-SRAM for FCNLs.
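A short functional sketch may help fix the quantities involved: the product-sum for 0/1 inputs and ±1 weights, and the worst-case BL current of a typical 6T cell with all rows activated. The function names and the unit cell currents are hypothetical and only illustrate the scaling with $N_{WL}$.

```python
def product_sum(inputs, weights):
    """Product-sum for binary inputs (0/1) and binary weights (+1/-1).

    Returns (ps, n_plus, n_minus), where n_plus and n_minus count the
    activated rows whose IN x W product is +1 and -1 respectively.
    """
    n_plus = sum(1 for x, w in zip(inputs, weights) if x == 1 and w == 1)
    n_minus = sum(1 for x, w in zip(inputs, weights) if x == 1 and w == -1)
    return n_plus - n_minus, n_plus, n_minus


def bitline_current_6t(n_wl, i_mc_c, i_mc_d):
    """BL current of a typical 6T CIM-SRAM with n_wl rows activated."""
    return n_wl * (i_mc_c + i_mc_d)


# Hypothetical unit cell currents (arbitrary units).
i_fcnl = bitline_current_6t(64, 1.0, 1.0)   # m = 64 weights per BL
i_cnvl = bitline_current_6t(9, 1.0, 1.0)    # n^2 = 9 for a 3x3 kernel
print(product_sum([1, 1, 0, 1], [1, -1, -1, 1]), i_fcnl / i_cnvl)
```

With these example sizes the FCNL case draws roughly seven times the current of the CNVL case, which is the scaling that motivates treating FCNL as the harder design target.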

The second challenge is inefficient binary-PS result or winner detection in FCNL. The regular and last layers in FCNLs use a reference voltage ($V_{REF}$) to identify the PS result ($\pm 1$) or to distinguish the winner ($V_{1ST}$) from the second candidate ($V_{2ND}$). In single-ended SRAM sensing, the $V_{REF}$ for sense amplifiers (SA) is a fixed value. However, simulations of PS result distributions for $V_{1ST}$ and $V_{2ND}$ using the MNIST database show that the ideal $V_{REF}$ covers a wide range (>0.4V). Even with a perfect SA, 5 to 6 sensing iterations are needed to approach the accuracy limit of the binary algorithm.

The third challenge is small voltage sensing margins ( $V_{SM}$ ) across different PS results on FCNLs. Three techniques are proposed to overcome these difficulties: (1) algorithm-dependent asymmetric control (ADAC), (2) dynamic input-aware  $V_{REF}$  generation (DIARG), and (3) a common-mode-insensitive (CMI) small-offset voltage-mode sense amplifier (VSA).

Data pattern analysis of MNIST test images revealed an intriguing asymmetry between the number of $IN \times W = +1$ ($N_{+1}$) and $IN \times W = -1$ ($N_{-1}$) on a BL in the last two FCNLs (i.e. the (q-1)<sup>th</sup> and q<sup>th</sup> layers). This is a generic characteristic among various applications, because IN×W results in the last layer are already polarized to have a single candidate that is most probable. Also, the $N_{+1}/N_{-1}$ asymmetry is opposite in the last two FCNLs. We used these characteristics to reduce $I_{BL}$ and macro power consumption using a newly proposed ADAC scheme combined with the previous split-WL DSC6T [8] cell. This allows for different WL/BL access modes for the two layers using the same CIM-SRAM unit-macro.

The ADAC scheme (Fig. 31.5.3) comprises an asymmetric-flag (AF), BL-selection switches (BLSW), WL-selection switches (WLSW), dual-path output-drivers (DPOD), BL-clamping (BLC) and DSC6T cells. AF can be pre-defined during training, or configured by an application, to specify whether to use WLL-BLL or WLR-BLR for sensing. It is determined by $N_{+1}$ and $N_{-1}$ of all the BLs in the macro ($N_{+1,M}$ and $N_{-1,M}$). For $N_{+1,M} > N_{-1,M}$, AF is asserted for WLL-BLL sensing. WLSW activates BLL sensing by asserting only the WLLs of the selected rows ($IN=1$), while all WLRs are grounded. Each BLL is connected to its corresponding VSA through BLSW, while BLR=VDD is isolated from the VSA. The VSA detects $V_{BL}$ and directs its output (SAOUT) through the non-inversion path of DPOD to DOUT. For AF=0 ($N_{+1,M} < N_{-1,M}$), WLR-BLR sensing is selected, the roles of WLR-BLR and WLL-BLL are switched, and the SAOUT is sent through the inversion path of DPOD to DOUT. Thus, ADAC+DSC6T consumes less $I_{BL}$ and macro power than a typical 6T cell due to (a) less parasitic load on WLL/WLR (1T per cell), (b) less $I_{BL}$ on the selected BL, and (c) no $I_{BL}$ from the unselected BL. In MNIST simulations, the ADAC+DSC6T scheme consumed 61.4% less current than a conventional 6T SRAM. To avoid write disturbance, we also employed BL-clamping (BLC), which prevents $V_{BL}$ from dropping below the cell-write threshold voltage.
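The flag-driven selection described above can be summarized in a small control sketch. It models only the decision logic (which WL/BL side is sensed and which DPOD path is used), not the analog behavior. The function names and the dictionary layout are hypothetical, and the tie case ($N_{+1,M} = N_{-1,M}$), which the text does not specify, is assigned to AF=0 here purely as an assumption.

```python
def select_sensing_side(n_plus_macro, n_minus_macro):
    """Choose the ADAC access mode from macro-wide +1 / -1 product counts.

    AF=1 selects WLL-BLL sensing with the non-inverting output path;
    AF=0 selects WLR-BLR sensing with the inverting output path.
    Ties are mapped to AF=0 (an assumption; not specified in the text).
    """
    if n_plus_macro > n_minus_macro:
        return {"af": 1, "wl": "WLL", "bl": "BLL", "invert_output": False}
    return {"af": 0, "wl": "WLR", "bl": "BLR", "invert_output": True}


def dout(sa_out, mode):
    """Route the sense-amplifier output bit through the selected DPOD path."""
    if sa_out not in (0, 1):
        raise ValueError("sa_out must be 0 or 1")
    return 1 - sa_out if mode["invert_output"] else sa_out


mode = select_sensing_side(n_plus_macro=900, n_minus_macro=300)
print(mode["wl"], mode["bl"], dout(1, mode))
```

Since the flag is fixed per layer, the same unit-macro can serve both of the last two FCNLs by changing only this setting, matching the opposite asymmetry reported for those layers.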

Figure 31.5.4 presents the DIARG scheme, where  $V_{REF}$  generation is based on  $N_{WL}$  for CNVL and FCNL modes of binary DNN. DIARG includes columns (RC1 and RC2) of fixed-zero (Q=0) reference-cells (FORC), a BL-header (BLH), a WL-combiner (WLCB), a reference-WL-tuner (RWLT), and a replica BL-selection switch (RBLSW). WLCB combines the WLL/WLR of a regular array with a reference WL (RC1WL) for RC1, such that RC1WL is asserted when WLL=1 or WLR=1. The BLL and BLR of RC1 are shorted together; therefore, when  $N_{WL}$  rows are activated, RC1 always provides the  $V_{REF}$  resulting from  $N_{WL}(I_{MC-D} + I_{MC-C})$ . The reference WL (RC2WL) for RC2 is controlled by RWLT to adjust  $I_{MC-D}$  and  $I_{MC-C}$  on BLL2/BLR2 for multiple-level or multi-iteration sensing. With RBLSW connecting BLL2/BLR2 to BLL1/BLR1, the required  $V_{REF}$  is a function of  $N_{WL}$ . In our example, RC1 alone (RC2 is decoupled) provides the required last-layer FCNL  $V_{REF}$  for MNIST applications. DIARG provides winner-detection accuracy of 97.3% in the first iteration, whereas conventional fixed- $V_{REF}$  requires four iterations. This reduces latency and energy overhead by over 4x.

We use a CMI-VSA to tolerate a small BL signal margin ( $V_{SM}$ ) against a wide  $V_{BL}$  common-mode ( $V_{BL-CM}$ ) range across various PS results (Fig. 31.5.5). The CMI-VSA includes two cross-coupled inverters (INV-L, INV-R), two capacitors (C1, C2), and eight switches (SW1-SW8) for auto-zeroing and margin enhancement. The CMI-VSA provides a 2.5x improvement in offset over a conventional VSA and a constant sensing delay across the  $V_{BL-CM}$  range. In standby mode, CMI-VSA latches the previous result. In phase-1 (PH1), control switches enable the two inverters (INV-L and INV-R) to auto-zero at their respective trigger points ( $V_{TRP-L}$  and  $V_{TRP-R}$ ), making the node voltages  $V_{INV}$  and  $V_{INR}$  equal to  $V_{BL}$  and  $V_{REF}$ . In PH2, ( $V_{BL}-V_{REF}$ ) and ( $V_{REF}-V_{BL}$ ) are respectively coupled to  $V_{CL}$  and  $V_{CR}$ . This ideally increases the difference in voltage ( $V_{INV}$ ) between  $V_{CL}$  and  $V_{CR}$  to  $2(V_{BL}-V_{REF})$ . In PH3, INV1 and INV2 amplify  $V_{INV}$  to generate full swing for  $V_{CL}$  and  $V_{CR}$ .

Figure 31.5.6 presents the measured results from a test chip with multiple $64 \times 64$ CIM-SRAM unit-macros, integrated with the last-two FCNLs and test-modes. For the last FCNL, the macro access time ($t_{AC-M}$) is 2.3ns for MNIST winner detection at $V_{DD}=1V$. For the integrated last-two FCN layers, $t_{AC-M-2Layers}=4.8ns$ for MNIST image identification. In a shmoo test, ADAC and CMI-VSA enabled support for $V_{WL}=0.8V$ at $V_{DD}=1V$ to suppress $I_{BL}$. The CIM macro achieved the fastest PS operations and the highest energy efficiency among CIM-SRAMs; i.e. over 4.17× faster than the previous CIM-SRAM [3]. Figure 31.5.7 presents the die photograph.

#### Acknowledgements:

The authors would like to thank TSMC-JDP, MTK-JDP, MOST-Taiwan for their support.

#### References:

- [1] K. Bong, et al., "A 0.62mW Ultra-Low-Power Convolutional-Neural-Network Face-Recognition Processor and a CIS Integrated with Always-On Haar-Like Face Detector," *ISSCC*, pp. 344-346, 2017.
- [2] M. Price, et al., "A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models and Voice-Activated Power Gating," *ISSCC*, pp. 244-245, 2017.
- [3] J. Zhang, et al., "A Machine-learning Classifier Implemented in a Standard 6T SRAM Array," *IEEE Symp. VLSI Circuits*, 2016.
- [4] F. Su, et al., "A 462GOPS/J RRAM-Based Nonvolatile Intelligent Processor for Energy Harvesting IoE System Featuring Nonvolatile Logics and Processing-In-Memory," *IEEE Symp. VLSI Circuits*, pp. 260-261, 2017.
- [5] M. Courbariaux, et al., "Binarynet: Training deep neural networks with weights and activations constrained to +1 or -1," *ArXiv: 1602.02830*, 2016, <https://arxiv.org/abs/1602.02830>.
- [6] M. Rastegari, et al., "XNOR-net: ImageNet classification using binary convolutional neural networks," *ArXiv: 1603.05279*, 2016, <https://arxiv.org/abs/1603.05279>.
- [7] M. Kim, et al., "Bitwise neural networks," *Int. Conf. on Machine Learning Workshop on Resource-Efficient Machine Learning*, 2015.
- [8] M.-F. Chang, et al., "A 28nm 256Kb 6T-SRAM with 280mV Improvement in VMIN Using a Dual-Split-Control Assist Scheme," *ISSCC*, pp. 314-315, 2015.



Figure 31.5.1: CIM-SRAM for AI edge processors.



Figure 31.5.2: Proposed CIM-SRAM structure and challenges.



Figure 31.5.3: Algorithm-dependent asymmetric control (ADAC) scheme.

Figure 31.5.4: Dynamic input-aware V<sub>REF</sub> generation (DIARG) scheme.

Figure 31.5.5: Operation of CMI-VSA.



Figure 31.5.6: Measured results.



Figure 31.5.7: Die photo and summary table.

# ISSCC GLOSSARY

| 1                                  |                                         | B             |                                                                 |
|------------------------------------|-----------------------------------------|---------------|-----------------------------------------------------------------|
| <b>1P1M</b>                        | 1-Polysilicon layer 1-Metal layer       | <b>BAN</b>    | Business Area Network                                           |
| <b>1T1C</b>                        | 1-Transistor 1-Capacitor                | <b>BAW</b>    | Bulk Acoustic Wave                                              |
| 3                                  |                                         | <b>BB</b>     | Baseband                                                        |
| <b>3D</b>                          | 3-Dimensional                           | <b>BBT</b>    | Band-to-Band Tunneling                                          |
| <b>3G</b>                          | Third-Generation (Wireless)             | <b>BCD</b>    | Bipolar-CMOS-DMOS Process                                       |
| <b>3T</b>                          | 3-Transistor                            | <b>BCD</b>    | Binary-Coded Decimal                                            |
| 6                                  |                                         | <b>BCH</b>    | Bose-Chaudhuri-Hocquenghem<br>(a type of error-correcting code) |
| <b>6T</b>                          | 6-Transistor                            | <b>BD</b>     | Blu-ray disc                                                    |
| 8                                  |                                         | <b>BER</b>    | Bit-Error Rate                                                  |
| <b>8T</b>                          | 8-Transistor                            | <b>BGA</b>    | Ball-Grid Array                                                 |
| $\Delta\Sigma$                     |                                         | <b>BGR</b>    | Band-Gap Reference                                              |
| <b><math>\Delta\Sigma</math></b>   | Delta-Sigma                             | <b>BiCMOS</b> | Bipolar Complementary-MOS                                       |
| <b><math>\Delta\Sigma M</math></b> | Delta-Sigma Modulator                   | <b>BIOS</b>   | Basic Input/Output System                                       |
| $\Sigma\Delta$                     |                                         | <b>BIST</b>   | Built-in Self-Test                                              |
| <b><math>\Sigma\Delta</math></b>   | $\Delta\Sigma$ is preferred.            | <b>BJT</b>    | Bipolar Junction Transistor                                     |
| <b><math>\Sigma\Delta M</math></b> | $\Delta\Sigma M$ is preferred.          | <b>BL</b>     | Bitline                                                         |
| A                                  |                                         | <b>BLE</b>    | Bluetooth Low Energy                                            |
| <b>a-Si</b>                        | Amorphous Silicon                       | <b>BOM</b>    | Bill of Materials                                               |
| <b>AC</b>                          | Alternating Current                     | <b>BPF</b>    | Bandpass Filter                                                 |
| <b>A/D</b>                         | Analog-to-Digital Converter             | <b>BPSK</b>   | Binary Phase-Shift Keying                                       |
| <b>AAC</b>                         | Advanced Audio Coding                   | <b>BSI</b>    | BackSide Illumination                                           |
| <b>ACI</b>                         | Adjacent-Channel Interference           | <b>B-VOP</b>  | Bidirectional-Video Object Planes                               |
| <b>ACL</b>                         | Access Control List                     | <b>BW</b>     | Bandwidth                                                       |
| <b>ACLR</b>                        | Adjacent-Channel Leakage Power Ratio    | C             |                                                                 |
| <b>ACPR</b>                        | Adjacent-Channel Power Ratio            | <b>C4</b>     | Controlled-Collapse Chip Connection                             |
| <b>ADC</b>                         | Analog-to-Digital Converter             | <b>CAD</b>    | Computer-Aided Design                                           |
| <b>ADDLL</b>                       | All-Digital DLL                         | <b>CAM</b>    | Content-Addressable Memory                                      |
| <b>ADPLL</b>                       | All-Digital PLL                         | <b>CAN</b>    | Controller Area Network                                         |
| <b>ADSL</b>                        | Asymmetric Digital Subscriber Line      | <b>CAS</b>    | Column-Address Strobe                                           |
| <b>ADU</b>                         | Analog-to-Digital Unit                  | <b>CCCS</b>   | Current-Controlled Current Source                               |
| <b>AES</b>                         | Advanced Encryption Standard            | <b>CCD</b>    | Charge-coupled Device                                           |
| <b>AFC</b>                         | Automatic Frequency Control             | <b>CCK</b>    | Complementary Code Keying                                       |
| <b>AFE</b>                         | Analog Front End                        | <b>CCO</b>    | Current-Controlled Oscillator                                   |
| <b>AFM</b>                         | Adaptive Flash Management™              | <b>CCVS</b>   | Current-Controlled Voltage Source                               |
| <b>AGC</b>                         | Automatic Gain Control                  | <b>CDAC</b>   | Capacitor DAC                                                   |
| <b>AGU</b>                         | Address-Generation Unit                 | <b>CDMA</b>   | Code-Division Multiple Access                                   |
| <b>AIP</b>                         | Artificial-Intelligent Partner          | <b>CDR</b>    | Clock and Data Recovery                                         |
| <b>ALU</b>                         | Arithmetic Logic Unit                   | <b>CDS</b>    | Correlated Double Sampling                                      |
| <b>AM</b>                          | Amplitude Modulation                    | <b>CF</b>     | Compact Flash                                                   |
| <b>AMI</b>                         | Advanced Metering Infrastructures       | <b>CFA</b>    | Color Filter Array                                              |
| <b>AMLCD</b>                       | Active-Matrix LCD                       | <b>CFL</b>    | Compact Fluorescent Lamp                                        |
| <b>AMOLED</b>                      | Active-Matrix OLED                      | <b>CHE</b>    | Channel Hot-Electron (Injection)                                |
| <b>AMP</b>                         | Asymmetric Multi-Processing             | <b>CISC</b>   | Complex-Instruction-Set Computer                                |
| <b>AMPS</b>                        | Advanced Mobile-Phone Service           | <b>CIS</b>    | CMOS Image Sensor                                               |
| <b>AMS</b>                         | Analog Mixed-Signal {System}            | <b>CLI</b>    | Command Line Interface                                          |
| <b>APD</b>                         | Avalanche Photo-Diode                   | <b>CML</b>    | Current-Mode Logic                                              |
| <b>APG</b>                         | Algorithmic Pattern Generator           | <b>CMOS</b>   | Complementary Metal-Oxide Semiconductor                         |
| <b>API</b>                         | Application-Programming Interface       | <b>CMRR</b>   | Common-Mode Rejection Ratio                                     |
| <b>APSK</b>                        | Amplitude Phase-Shift Keying            | <b>CMU</b>    | Clock Multiplier Unit                                           |
| <b>ARM</b>                         | Advanced RISC Machine                   | <b>CMUT</b>   | Capacitive Micromachined Ultrasonic Transducer                  |
| <b>ASIC</b>                        | Application-Specific Integrated Circuit | <b>CNFET</b>  | Carbon Nanotube NFET                                            |
| <b>ASK</b>                         | Amplitude Shift Keying                  | <b>CO</b>     | Central-Office (hardware)                                       |
| <b>ASP</b>                         | Average Selling Price                   | <b>CODEC</b>  | Coder-Decoder                                                   |
| <b>ASP</b>                         | Advanced Simple Profile (MPEG-4 Video)  | <b>COFDM</b>  | Coded FDM                                                       |
| <b>ATA</b>                         | Advanced Technology Attachment          | <b>CoMP</b>   | Coordinated MultiPoint                                          |
| <b>ATD</b>                         | Address-Transition Detection            | <b>CPE</b>    | Customer Premises Equipment                                     |
| <b>ATE</b>                         | Automatic Test Equipment                | <b>CPU</b>    | Central Processing Unit                                         |
| <b>ATM</b>                         | Asynchronous Transfer Mode              | <b>CPW</b>    | CoPlanar Waveguide                                              |
| <b>ATSC</b>                        | Advanced Television Systems Committee   | <b>CRC</b>    | Cyclic Redundancy Check                                         |
| <b>AVC</b>                         | Audio-Visual CODEC                      | <b>CSMA</b>   | Carrier-Sense Multiple Access                                   |
| <b>AVC</b>                         | Automatic Volume Control (use AGC)      | <b>CT</b>     | Continuous Time (system)                                        |
| <b>AWG</b>                         | Arrayed-Waveguide Grating               | <b>CUI</b>    | Command User Interface                                          |
|                                    |                                         | <b>CVD</b>    | Chemical Vapor Deposition                                       |


| D       |                                                               | E          |                                                                               |
|---------|---------------------------------------------------------------|------------|-------------------------------------------------------------------------------|
| D/A     | Digital-to-Analog Converter                                   | ECC        | Error-Correcting Code                                                         |
| DAB     | Digital-Audio Broadcasting                                    | ECG        | Electro-CardioGram                                                            |
| DAC     | Digital-to-Analog Converter                                   | ECL        | Emitter-Coupled Logic                                                         |
| dBFS    | dB relative to Full Scale                                     | ECP        | Emitter-Coupled Pair                                                          |
| DBS     | Direct-Broadcast Satellite                                    | EDGE       | Enhanced Data rates for Global Evolution                                      |
| DCC     | Duty-Cycle Corrector                                          | EDR        | Enhanced Data Rate                                                            |
| DCO     | Digitally Controlled Oscillator                               | EEG        | Electro-EncephaloGram                                                         |
| DCT     | Discrete Cosine Transform                                     | EEPROM     | Electrically Erasable Programmable Read-Only Memory                           |
| DCVS    | Differential Cascode Voltage Switch                           | EFR        | Enhanced Full-Rate (GSM)                                                      |
| DCVS    | Digitally-Controlled Voltage Source                           | EIRP       | Effective Isotropic Radiated Power                                            |
| DCXO    | Digitally Controlled Crystal Oscillator                       | EKG        | Electro-CardioGram (see ECG)                                                  |
| DDFS    | Direct-Digital Frequency Synthesis<br>(or synthesizer)        | EMG        | ElectroMyoGram                                                                |
| DDR     | Double Data Rate                                              | EMI        | ElectroMagnetic Interference                                                  |
| DDS     | Direct Digital Synthesis                                      | eICIC      | enhanced Inter-Cell Interference Coordination                                 |
| DECT    | Digital Enhanced Cordless Telecommunications                  | ENOB       | Effective Number of Bits                                                      |
| DEM     | DEModulator                                                   | EOT        | Electrical Oxide Thickness                                                    |
| DEM     | Dynamic Element Matching                                      | EPON       | Ethernet-based Passive Optical Network                                        |
| DEMOS   | Depletion MOS                                                 | EPROM      | Erasable Programmable Read-Only Memory                                        |
| DEMUX   | Demultiplexer                                                 | ERBW       | Effective-Resolution Bandwidth                                                |
| DES     | Data-Encryption Standard                                      | ESD        | ElectroStatic Discharge                                                       |
| DEVM    | Differential Error Vector Magnitude                           | EUV        | Extreme UltraViolet                                                           |
| DFE     | Decision-Feedback Equalizer                                   | EVDO       | EVolution Data Optimized (in the context of CDMA)                             |
| DFF     | D-type Flip Flop                                              | EVM        | Error-Vector Magnitude                                                        |
| DFT     | Design for Testability                                        | EWC        | Enhanced Wireless Consortium                                                  |
| DFT     | Discrete Fourier Transform                                    |            |                                                                               |
| DfY     | Design-for-Yield                                              |            |                                                                               |
| DIMM    | Dual In-Line Memory Module                                    | <b>f</b>   |                                                                               |
| DIP     | Dual In-line Package                                          | $f_{\max}$ | Unity power gain frequency.                                                   |
| DLL     | Delay-Locked Loop                                             | $f_s$      | Sampling frequency                                                            |
| DMA     | Direct Memory Access                                          | $f_t$      | Transit frequency                                                             |
| DMB     | Digital Multimedia Broadcasting                               | FAMOS      | Floating-gate Avalanche-injection MOS Transistor                              |
| DMIPS   | Dhrystone Million Instructions Per Second                     | FAN        | Field Area Network                                                            |
| DMOS    | (Double-)Diffused MOS                                         | FBAR       | Film Bulk Acoustic Resonator                                                  |
| DNA     | Deoxyribonucleic Acid                                         | FBDIMM     | Fully Buffered DIMM                                                           |
| DNL     | Differential Non-Linearity                                    | FCC        | Federal Communications Commission (U.S.)                                      |
| DNR     | Dynamic Range (DR preferred)                                  | FDM        | Frequency-Division Multiplexing                                               |
| DPT     | Double Patterning Technology                                  | FDMA       | Frequency-Division Multiple-Access                                            |
| DOM     | Digital Optical Module                                        | FDNR       | Frequency-Dependent Negative Resistor                                         |
| DR      | Dynamic Range (see also DNR)                                  | FDSOI      | Fully Depleted Silicon-on-Insulator                                           |
| DRAM    | Dynamic Random-Access Memory                                  | FEC        | Forward Error Correction                                                      |
| DRC     | Design-Rule Check                                             | FEM        | Front End Module                                                              |
| DSB     | Double Side Band                                              | FeRAM      | Ferro-electric Random Access Memory                                           |
| DSC     | Digital Still Camera                                          | FET        | Field-Effect Transistor                                                       |
| DSL     | Digital Subscriber Line                                       | FF         | Flip-Flop                                                                     |
| DS-OFDM | Direct-Sequence Orthogonal Frequency<br>Division Multiplexing | FFC        | Flexible-Flat Cable                                                           |
| DSP     | Digital Signal Processing                                     | FFE        | Feed-Forward Equalizer                                                        |
| DSSS    | Direct-Sequence Spread-Spectrum                               | FFT        | Fast Fourier Transform                                                        |
| DT      | Discrete Time                                                 | FIB        | Focused Ion Beam                                                              |
| DTL     | Diode-Transistor Logic                                        | FIFO       | First In-First Out                                                            |
| DTV     | Digital Television                                            | FinFET     | A MOSFET with the gate on two sides<br>{acronym describes the physical shape} |
| DUT     | Device Under Test                                             | FIR        | Finite Impulse Response (filter)                                              |
| DVB     | Digital-Video Broadcasting                                    | FLOPS      | Floating-Point Operations Per Second                                          |
| DVB-C   | Digital-Video Broadcasting - Cable                            | FLOTOX     | Floating-gate Tunnel OXide                                                    |
| DVB-H   | Digital-Video Broadcasting - Handhelds                        | FM         | Frequency Modulation                                                          |
| DVB-S   | Digital-Video Broadcasting - Satellite                        | FMCW       | Frequency Modulated Continuous Wave                                           |
| DVB-T   | Digital-Video Broadcasting - Terrestrial                      | FN         | Fowler-Nordheim                                                               |
| DVD     | Digital Video Disc                                            | FO4        | Fan-Out of 4                                                                  |
| DVFS    | Dynamic Voltage Frequency Scaling                             | FOM        | Figure Of Merit                                                               |
| DVS     | Dynamic Voltage Scaling                                       | FPGA       | Field-Programmable Gate Array                                                 |
| DWA     | Data Weighted Averaging                                       | FPN        | Fixed Pattern Noise                                                           |
| DWDM    | Dense-Wavelength-Division Multiplexing                        | FPU        | Floating Point Unit                                                           |
| DWMT    | Discrete Wavelet Multi-Tone                                   | FSG        | FluoroSilicate Glass (dielectric)<br>Fluorine-doped Silicate Glass            |
|         |                                                               | FSK        | Frequency-Shift Keying                                                        |
|         |                                                               | FSM        | Finite-State Machine                                                          |


## G

|               |                                           |
|---------------|-------------------------------------------|
| <b>GAA</b>    | Gate-All-Around                           |
| <b>GaN</b>    | Gallium Nitride                           |
| <b>GBW</b>    | Gain-BandWidth (product)                  |
| <b>GCA</b>    | Gain-Controlled Amplifier                 |
| <b>GDDR</b>   | Graphics Double-Data-Rate                 |
| <b>GDM</b>    | Generalized Design-for-Manufacturability  |
| <b>GE-PHY</b> | Gigabit-Ethernet Physical                 |
| <b>GFSK</b>   | Gaussian Frequency-Shift Keying           |
| <b>GFLOPS</b> | Giga Floating-Point Operations Per Second |
| <b>GIDL</b>   | Gate-Induced Drain Leakage                |
| <b>GMSK</b>   | Gaussian Minimum-Shift Keying             |
| <b>GOPS</b>   | Giga-Operations Per Second                |
| <b>GPRS</b>   | General Packet-Radio Service              |
| <b>GPS</b>    | Global Positioning System                 |
| <b>GPU</b>    | Graphics Processing Unit                  |
| <b>GSM</b>    | Global System for Mobile Communications   |
| <b>GUI</b>    | Graphical User Interface                  |
| <b>GVCO</b>   | Gated VCO                                 |

## H

|               |                                           |
|---------------|-------------------------------------------|
| <b>HBT</b>    | Hetero-junction Bipolar Transistor        |
| <b>HBM</b>    | High-Bandwidth Memory                     |
| <b>HBM</b>    | Human Body Model                          |
| <b>HCI</b>    | Host-Controller Interface                 |
| <b>HCI</b>    | Human-to-Computer Interface               |
| <b>HD</b>     | High-Density                              |
| <b>HDD</b>    | Hard-Disk-Drive                           |
| <b>HDL</b>    | Hardware-Description Language             |
| <b>HDTV</b>   | High-Definition TeleVision                |
| <b>HD2</b>    | 2<sup>nd</sup>-order Harmonic Distortion  |
| <b>HD3</b>    | 3<sup>rd</sup>-order Harmonic Distortion  |
| <b>HiFi</b>   | High Fidelity                             |
| <b>Hk</b>     | High-k dielectric                         |
| <b>HMD</b>    | Head Mounted Display                      |
| <b>HPF</b>    | High-Pass Filter                          |
| <b>HSDPA</b>  | High-Speed Downlink Packet Access         |
| <b>HT3</b>    | Hypertransport 3 (I/O standard)           |
| <b>HTM</b>    | Hierarchical Temporal Memory              |
| <b>HV</b>     | High Voltage                              |
| <b>HVAC</b>   | Heating, Ventilation and Air Conditioning |
| <b>HVCMOS</b> | High-Voltage Complementary MOS            |
| <b>HVMOS</b>  | High-Voltage MOS                          |

## I

|               |                                                             |
|---------------|-------------------------------------------------------------|
| <b>I/O</b>    | Input-Output                                                |
| <b>I/Q</b>    | In Phase and Quadrature                                     |
| <b>IAN</b>    | Industrial Area Network                                     |
| <b>IBOC</b>   | In-Band On-Channel                                          |
| <b>IC</b>     | Integrated Circuit                                          |
| <b>ID</b>     | Identification                                              |
| <b>IF</b>     | Intermediate Frequency                                      |
| <b>IGBT</b>   | Insulated Gate Bipolar Transistor                           |
| <b>IIP2</b>   | Input-referred 2<sup>nd</sup>-order Intercept Point         |
| <b>IIP3</b>   | Input-referred 3<sup>rd</sup>-order Intercept Point         |
| <b>IIR</b>    | Infinite Impulse Response Filter                            |
| <b>IMD</b>    | Inter-Modulation Distortion                                 |
| <b>IM2</b>    | 2<sup>nd</sup>-order InterModulation distortion             |
| <b>IM3</b>    | 3<sup>rd</sup>-order InterModulation distortion             |
| <b>INL</b>    | Integral Non-Linearity                                      |
| <b>InP</b>    | Indium Phosphide                                            |
| <b>IoT</b>    | Internet of Things                                          |
| <b>IP</b>     | Intellectual Property                                       |
| <b>IPC</b>    | Inter-Process Communication                                 |
| <b>IPSEC</b>  | Internet (Network) Protocol for Security                    |
| <b>IR</b>     | Image Rejection                                             |
| <b>IR</b>     | InfraRed                                                    |
| <b>ISDB</b>   | Integrated Services Digital Broadcasting                    |
| <b>ISDB-C</b> | Integrated Services Digital Broadcasting - Cable            |
| <b>ISDB-S</b> | Integrated Services Digital Broadcasting - Satellite        |
| <b>ISDB-T</b> | Integrated Services Digital Broadcasting - Terrestrial      |
| <b>ISFET</b>  | Ion-Sensitive Field-Effect Transistor                       |
| <b>ISI</b>    | Inter-Symbol Interference                                   |
| <b>ISM</b>    | Industrial, Scientific and Medical Band                     |
| <b>ITU-T</b>  | International Telecommunications Union                      |

## J

|             |
|-------------|
| <b>JPEG</b> |
| <b>JTAG</b> |

## K

|            |
|------------|
| <b>KGD</b> |

## L

|               |
|---------------|
| <b>LAGS</b>   |
| <b>LAN</b>    |
| <b>LBS</b>    |
| <b>LCD</b>    |
| <b>LCOS</b>   |
| <b>LDI</b>    |
| <b>LDCMOS</b> |
| <b>LDMOS</b>  |
| <b>LDO</b>    |
| <b>LDPC</b>   |
| <b>LED</b>    |
| <b>LEP</b>    |
| <b>LFCSP</b>  |
| <b>LFSR</b>   |
| <b>LHC</b>    |
| <b>LIN</b>    |
| <b>LMS</b>    |
| <b>LNA</b>    |
| <b>LNB</b>    |
| <b>LO</b>     |
| <b>LPCVD</b>  |
| <b>LPF</b>    |
| <b>LRU</b>    |
| <b>LSB</b>    |
| <b>LSI</b>    |
| <b>LTE</b>    |
| <b>LTPS</b>   |
| <b>LV</b>     |
| <b>LVDS</b>   |
| <b>LVS</b>    |

## M

|                |
|----------------|
| <b>MAC</b>     |
| <b>MAC</b>     |
| <b>MASH</b>    |
| <b>MBOA</b>    |
| <b>MB-OFDM</b> |
| <b>MCM</b>     |
| <b>MCP</b>     |
| <b>MCU</b>     |
| <b>MCU</b>     |
| <b>MDAC</b>    |
| <b>MEMS</b>    |
| <b>MER</b>     |
| <b>MfD</b>     |
| <b>MPU</b>     |
| <b>microSD</b> |
| <b>MICS</b>    |
| <b>MIM</b>     |
| <b>MIMO</b>    |
| <b>MIPI</b>    |
| <b>MIPS</b>    |
| <b>MISR</b>    |
| <b>ML</b>      |


|               |                                                    |
|---------------|----------------------------------------------------|
| <b>MLC</b>    | Multi-Level Cell                                   |
| <b>MLSD</b>   | Maximum-Likelihood-Sequence Detection              |
| <b>MLSE</b>   | Maximum-Likelihood-Sequence Estimation             |
| <b>MMAC</b>   | Mega Multiply-Accumulate                           |
| <b>MMIC</b>   | Monolithic Microwave Integrated Circuit            |
| <b>MMIO</b>   | Memory-Mapped I/O                                  |
| <b>mmW</b>    | millimeter-Wave                                    |
| <b>MODEM</b>  | Modulator-Demodulator                              |
| <b>MOS</b>    | Metal-Oxide-Semiconductor (Silicon)                |
| <b>MOSFET</b> | Metal Oxide Semiconductor Field-Effect Transistor  |
| <b>MOST</b>   | MOS Transistor                                     |
| <b>MOST</b>   | Media-Oriented Systems Transport                   |
| <b>MP</b>     | Multi-Processor                                    |
| <b>MP3</b>    | MPEG-1 audio layer 3 (lossy compression algorithm) |
| <b>MPEG</b>   | Moving Picture Experts Group                       |
| <b>Mpps</b>   | Million packets per second (VoIP)                  |
| <b>MPPT</b>   | Maximum-Power-Point Tracker                        |
| <b>MPU</b>    | Microprocessors                                    |
| <b>MRAM</b>   | Magnetic Random-Access Memory                      |
| <b>MRAM</b>   | Magnetoresistive Random-Access Memory              |
| <b>MRC</b>    | Maximum-Ratio Combining                            |
| <b>MSB</b>    | Most-Significant Bit                               |
| <b>MTJ</b>    | Magnetic Tunnel Junction                           |
| <b>MTPR</b>   | Multi-Tone Power Ratio                             |
| <b>MUX</b>    | Multiplexer                                        |
| <b>MWPC</b>   | Multi-Wire Proportional Chamber                    |

## N

|              |                                                            |             |                                      |
|--------------|------------------------------------------------------------|-------------|--------------------------------------|
| <b>NAICS</b> | Network Assisted Interference Cancellation and Suppression | <b>PoP</b>  | Package-on-Package                   |
| <b>NAN</b>   | Neighbourhood Area Network                                 | <b>POTS</b> | Plain-Old Telephone Service          |
| <b>NBTI</b>  | Negative-Bias Temperature Instability                      | <b>ppm</b>  | parts per million                    |
| <b>NEF</b>   | Noise Efficiency Factor                                    | <b>PPM</b>  | Pulse-Position Modulation            |
| <b>NEM</b>   | Nano-Electro-Mechanical system                             | <b>PR</b>   | Pseudo Random                        |
| <b>NF</b>    | Noise Figure                                               | <b>PR</b>   | Partial Response                     |
| <b>NFV</b>   | Network Function Virtualization                            | <b>PRAM</b> | Phase-Change RAM                     |
| <b>NMOS</b>  | N-channel MOS                                              | <b>PRBS</b> | Pseudo-Random Binary Sequence        |
| <b>NMOST</b> | NMOS Transistor                                            | <b>PRML</b> | Partial-Response, Maximum-Likelihood |
| <b>NoC</b>   | Network on (a) Chip                                        | <b>PROM</b> | Programmable Read-Only Memory        |
| <b>NPN</b>   | N-type-P-type-N-type bipolar (transistor)                  | <b>PSD</b>  | Power Spectral Density               |
| <b>NRTZ</b>  | Non Return-To-Zero (see also NRZ)                          | <b>PSK</b>  | Phase-Shift Keying                   |
| <b>NRZ</b>   | Non-Return-to-Zero (see also NRTZ)                         | <b>PSNR</b> | Peak SNR                             |
| <b>NTF</b>   | Noise Transfer Function                                    | <b>PSRR</b> | Power-Supply Rejection Ratio         |
| <b>NVM</b>   | Non-Volatile Memory                                        | <b>PTAT</b> | Proportional To Absolute Temperature |
| <b>NVRAM</b> | Non-Volatile Random-Access Memory                          | <b>PV</b>   | PhotoVoltaics                        |

## O

|             |                                                                       |
|-------------|-----------------------------------------------------------------------|
| <b>ODT</b>  | On-die Termination                                                    |
| <b>OEM</b>  | Original-Equipment Manufacturer                                       |
| <b>OFDM</b> | Orthogonal Frequency-Division Multiplexing                            |
| <b>OIF</b>  | Optical-Internetworking Forum                                         |
| <b>OIP2</b> | Output-referred intercept point for 2<sup>nd</sup>-order distortion   |
| <b>OIP3</b> | Output-referred intercept point for 3<sup>rd</sup>-order distortion   |
| <b>OLED</b> | Organic LED                                                           |
| <b>ONO</b>  | Oxide-Nitride-Oxide                                                   |
| <b>OOK</b>  | On-Off Keying                                                         |
| <b>OSR</b>  | Over-Sampling Ratio                                                   |
| <b>OTA</b>  | Operational Transconductance Amplifier                                |
| <b>OTP</b>  | One Time Programmable                                                 |

## P

|                        |                            |
|------------------------|----------------------------|
| <b>P<sub>1dB</sub></b> | 1dB gain-compression Point |
| <b>PA</b>              | Power Amplifier            |
| <b>PA</b>              | Public-Address (system)    |
| <b>PAE</b>             | Power-Added Efficiency     |
| <b>PAM</b>             | Pulse-Amplitude Modulation |
| <b>PAN</b>             | Personal-Area Network      |
| <b>PCB</b>             | Printed-Circuit Board      |
| <b>PCH</b>    | Platform Controller Hub                                        |
| <b>PCI</b>    | Peripheral-Component Interconnect                              |
| <b>PCI-X</b>  | PCI eXtended                                                   |
| <b>PCM</b>    | Pulse-Code Modulation                                          |
| <b>PCS</b>    | Personal Communication Services                                |
| <b>PCU</b>    | Power-management Control Unit                                  |
| <b>PD</b>     | Phase Detector                                                 |
| <b>PD-SOI</b> | Partially-Depleted SOI                                         |
| <b>PDA</b>    | Personal Data Assistant                                        |
| <b>PDN</b>    | Power-Delivery Network                                         |
| <b>PDP</b>    | Power-Delay Product                                            |
| <b>PFD</b>    | Phase and Frequency Detector                                   |
| <b>PGA</b>    | Programmable-Gain Amplifier                                    |
| <b>PGA</b>    | Programmable Gate Array                                        |
| <b>PHEMT</b>  | Pseudomorphic High-Electron-Mobility Transistor                |
| <b>PHEVs</b>  | Plug-in Hybrid Electric Vehicles                               |
| <b>PHY</b>    | PHYsical layer (of a communications protocol)                  |
| <b>PID</b>    | Proportional, Integral, Derivative (a type of control loop)    |
| <b>PLA</b>   | Programmable Logic Array                  |
| <b>PLC</b>   | Power-Line Communication                  |
| <b>PLD</b>   | Programmable Logic Device                 |
| <b>PLL</b>   | Phase-Locked Loop                         |
| <b>PMOS</b>  | P-channel MOS                             |
| <b>PMOST</b> | PMOS transistor                           |
| <b>PMU</b>   | Power Management Unit                     |
| <b>PNP</b>   | P-type-N-type-P-type bipolar (transistor) |
| <b>PON</b>   | Passive Optical Network                   |
| <b>PoP</b>   | Package-on-Package                        |
| <b>POTS</b>  | Plain-Old Telephone Service               |
| <b>ppm</b>   | parts per million                         |
| <b>PPM</b>   | Pulse-Position Modulation                 |
| <b>PR</b>    | Pseudo Random                             |
| <b>PR</b>    | Partial Response                          |
| <b>PRAM</b>  | Phase-Change RAM                          |
| <b>PRBS</b>  | Pseudo-Random Binary Sequence             |
| <b>PRML</b>  | Partial-Response, Maximum-Likelihood      |
| <b>PROM</b>  | Programmable Read-Only Memory             |
| <b>PSD</b>   | Power Spectral Density                    |
| <b>PSK</b>   | Phase-Shift Keying                        |
| <b>PSNR</b>  | Peak SNR                                  |
| <b>PSRR</b>  | Power-Supply Rejection Ratio              |
| <b>PTAT</b>  | Proportional To Absolute Temperature      |
| <b>PV</b>    | PhotoVoltaics                             |
| <b>PVD</b>   | Physical Vapor Deposition                 |
| <b>PVT</b>   | Process, Voltage, Temperature             |
| <b>PWM</b>   | Pulse-Width Modulation                    |

## Q

|             |                                          |
|-------------|------------------------------------------|
| <b>QPT</b>  | Quadruple Patterning Technology          |
| <b>QAM</b>  | Quadrature Amplitude Modulation          |
| <b>QDR</b>  | Quad Data Rate                           |
| <b>QoS</b>  | Quality of Service                       |
| <b>QPSK</b> | Quadrature Phase-Shift Keying            |
| <b>QVCO</b> | Quadrature Voltage-Controlled Oscillator |
| <b>QVGA</b> | Quarter Video Graphics Array             |

## R

|             |                                  |
|-------------|----------------------------------|
| <b>R/W</b>  | Read/Write                       |
| <b>RAM</b>  | Random-Access Memory             |
| <b>RAT</b>  | Radio Access Technologies        |
| <b>RBW</b>  | Resolution BandWidth             |
| <b>RDAC</b> | Resistor DAC                     |
| <b>RDF</b>  | Random Dopant Fluctuation        |
| <b>RF</b>   | Radio Frequency                  |
| <b>RFID</b> | RF ID (tag)                      |
| <b>RISC</b> | Reduced-Instruction-Set Computer |
| <b>ROM</b>  | Read-Only Memory                 |
| <b>rms</b>  | root mean square                 |
| <b>RSA</b>  | A public-key cryptographic system, named after: Ron Rivest, Adi Shamir, and Leonard Adleman |
| <b>RSSI</b> | Received-Signal-Strength Indicator                                                          |
| <b>RTL</b>  | Resistor-Transistor Logic                                                                   |
| <b>RTS</b>  | Random Telegraph Signal                                                                     |
| <b>RTZ</b>  | Return-To-Zero                                                                              |
| <b>RX</b>   | Receiver                                                                                    |
| <b>RZ</b>   | Return-to-Zero (see also RTZ)                                                               |

## S

|               |                                           |
|---------------|-------------------------------------------|
| <b>SAL</b>    | Service Abstraction Layer                 |
| <b>SAR</b>    | Successive-Approximation-Register         |
| <b>SAW</b>    | Surface Acoustic Wave                     |
| <b>SATA</b>   | Serial Advanced-Technology Attachment     |
| <b>SC</b>     | Switched-Capacitor                        |
| <b>SCL</b>    | Source-Coupled Logic                      |
| <b>SCP</b>    | Source-Coupled Pair                       |
| <b>SCR</b>    | Silicon Controlled Rectifier              |
| <b>S-DMB</b>  | Satellite Digital-Multimedia Broadcasting |
| <b>SD</b>     | Secure Digital (package type)             |
| <b>SDI</b>    | Software-Defined Infrastructure           |
| <b>SDN</b>    | Software-Defined Networking               |
| <b>SDR</b>    | Software Defined Radio                    |
| <b>SDRAM</b>  | Synchronous Dynamic Random-Access Memory  |
| <b>SEM</b>    | Scanning Electron Microscope              |
| <b>SER</b>    | Soft-Error Rate                           |
| <b>SER</b>    | Symbol-Error Rate                         |
| <b>SerDes</b> | Serializer/Deserializer                   |
| <b>SFDR</b>   | Spurious-Free Dynamic Range               |
| <b>SFI</b>    | SerDes Framer Interface                   |
| <b>SFP</b>    | Small Form-factor Pluggable               |
| <b>S/H</b>    | Sample-and-Hold                           |
| <b>SHA</b>    | Sample-and-Hold Amplifier                 |
| <b>SiC</b>    | Silicon Carbide                           |
| <b>SiGe</b>   | Silicon Germanium                         |
| <b>SiGe:C</b> | Silicon, Germanium, Carbon                |
| <b>SIL</b>    | Safety Integrity Level                    |
| <b>SILC</b>   | Stress-Induced Leakage Current            |
| <b>SIMD</b>   | Single-Instruction Multiple-Data          |
| <b>SINAD</b>  | Signal-to-Noise And Distortion (ratio)    |
| <b>SIO</b>    | Synchronous I/O                           |
| <b>SIP</b>    | Single-Inline Package                     |
| <b>SiP</b>    | System in (a) Package                     |
| <b>SL</b>     | Searchline                                |
| <b>SMP</b>    | Symmetric Multi-Processing                |
| <b>SMS</b>    | Short-Messaging Service                   |
| <b>SNDR</b>   | Signal-to-Noise and Distortion Ratio      |
| <b>SNM</b>    | Static Noise Margin                       |
| <b>SNR</b>    | Signal-to-Noise Ratio                     |
| <b>SNS</b>    | Social-Network Service                    |
| <b>SoC</b>    | System on (a) Chip                        |
| <b>SOI</b>    | Semiconductor on Insulator                |
| <b>SONET</b>  | Synchronous Optical NETwork               |
| <b>SONOS</b>  | Silicon-Oxide-Nitride-Oxide-Silicon       |
| <b>SOS</b>    | Silicon On Sapphire                       |
| <b>SP</b>     | Simple Profile                            |
| <b>SPAD</b>   | Single-Photon Avalanche Diode             |
| <b>SPDT</b>   | Single Pole Double Throw                  |
| <b>SPI</b>    | System Packet Interface                   |
| <b>SRAM</b>   | Static Random-Access Memory               |
| <b>SSB</b>    | Single Side-Band                          |
| <b>SSC</b>    | Spread Spectrum Clocking                  |
| <b>SSCG</b>   | Spread Spectrum Clock Generator           |
| <b>SSD</b>    | Solid-State Disk                          |
| <b>SSN</b>    | Simultaneous Switching Noise              |
| <b>SSO</b>    | Simultaneous Switching Output             |
| <b>SSTL</b>   | Stub Series Terminated Logic              |
| <b>SXGA</b>   | Super Extended Graphics Array             |

## T

|                 |                                                         |
|-----------------|---------------------------------------------------------|
| <b>TC</b>       | Temperature Coefficient                                 |
| <b>TCAM</b>     | Ternary Content-Addressable Memory                      |
| <b>TCON</b>     | Timing Controller                                       |
| <b>TCP</b>      | Transmission Control Protocol                           |
| <b>TDC</b>      | Time-to-Digital Converter                               |
| <b>TDDB</b>     | Time-Dependent Dielectric Breakdown                     |
| <b>TDM</b>      | Time-Division Multiplexing                              |
| <b>TDMA</b>     | Time-Division Multiple-Access                           |
| <b>TD-SCDMA</b> | Time-Division Synchronous Code-Division Multiple Access |
| <b>TEM</b>    | Transmission Electron Microscope          |
| <b>TFLOPS</b> | Tera Floating-Point Operations Per Second |
| <b>TFT</b>    | Thin-Film Transistor                      |
| <b>TIA</b>    | TransImpedance Amplifier                  |
| <b>T/H</b>    | Track and Hold                            |
| <b>THA</b>    | Track-and-Hold Amplifier                  |
| <b>THD</b>    | Total Harmonic Distortion                 |
| <b>THD+N</b>  | THD plus Noise                            |
| <b>TLB</b>    | Translation Lookaside Buffer              |
| <b>ToF</b>    | Time of flight                            |
| <b>TOPS</b>   | Tera-Operations Per Second                |
| <b>TSV</b>    | Through-Silicon Via                       |
| <b>TTL</b>    | Transistor-Transistor Logic               |
| <b>TV</b>     | Television                                |
| <b>TWh</b>    | Tera-Watt hour                            |
| <b>TX</b>     | Transmitter                               |

## U

|              |                                                |
|--------------|------------------------------------------------|
| <b>UD</b>    | Ultra-high Definition                          |
| <b>UDTV</b>  | Ultra-high-Definition TeleVision               |
| <b>UGC</b>   | User-Generated Contents                        |
| <b>UHDV</b>  | Ultra-High-Definition Video                    |
| <b>UHF</b>   | Ultra-High Frequency                           |
| <b>UI</b>    | Unit Interval                                  |
| <b>UI</b>    | User Interface                                 |
| <b>UIPP</b>  | UI <sub>pp</sub> (peak-to-peak)                |
| <b>U-NII</b> | Unlicensed National Information Infrastructure |
| <b>UMPC</b>  | Ultra-Mobile-PC                                |
| <b>UMTS</b>  | Universal Mobile-Telecommunication System      |
| <b>UPROM</b> | Unerasable Programmable Read-Only Memory       |
| <b>USB</b>   | Universal Serial Bus                           |
| <b>UTP</b>   | Unshielded Twisted Pair                        |
| <b>UWB</b>   | Ultra-WideBand                                 |
| <b>UXGA</b>  | Ultra-eXtended Graphics Array                  |
| <b>UPF</b>   | Unified Power Format                           |

## V

|                      |                                            |
|----------------------|--------------------------------------------|
| <b>V<sub>t</sub></b> | MOS transistor threshold voltage           |
| <b>V<sub>T</sub></b> | thermal voltage                            |
| <b>VCCS</b>          | Voltage-Controlled Current Source          |
| <b>VCDL</b>          | Voltage-Controlled Delay Line              |
| <b>VCO</b>           | Voltage-Controlled Oscillator              |
| <b>VCVS</b>          | Voltage-Controlled Voltage-Source          |
| <b>VCXO</b>          | Voltage-Controlled Crystal Oscillator      |
| <b>VCSEL</b>         | Vertical-Cavity Surface-Emitting Laser     |
| <b>VDMOS</b>         | Vertically Diffused MOS                    |
| <b>VDSL</b>          | Very high bit-rate Digital Subscriber Line |
| <b>VGA</b>           | Variable-Gain Amplifier                    |
| <b>VGA</b>           | Video Graphics Array                       |
| <b>VLIW</b>          | Very Long Instruction Word                 |
| <b>VLF</b>           | Very Low Frequency                         |
| <b>VLSI</b>          | Very Large-Scale Integration               |
| <b>VoD</b>           | Video on Demand                            |
| <b>VoIP</b>          | Voice over IP                              |
| <b>VR</b>            | Voltage Regulator                          |
| <b>VRM</b>           | Voltage Regulator Module                   |
| <b>VRT</b>           | Variable-Retention Time                    |
| <b>VSB</b>           | Vestigial Side Band                        |
| <b>VSoC</b> | Virtual System-on-Chip      |
| <b>VSWR</b> | Voltage Standing-Wave Ratio |
| <b>VTG</b>  | Vertical-Transfer Gate      |

## W

|               |                                                                                            |
|---------------|--------------------------------------------------------------------------------------------|
| <b>WAN</b>    | Wide-Area Network                                                                          |
| <b>WCDMA</b>  | Wideband Code-Division Multiple Access                                                     |
| <b>WDM</b>    | Wavelength-Division Multiplexing                                                           |
| <b>WebRTC</b> | Web Real-Time Communication                                                                |
| <b>WEP</b>    | Wired-Equivalent Privacy                                                                   |
| <b>WiFi</b>   | Wireless Fidelity (an interoperability certification for IEEE 802.11 WLAN products)       |
| <b>WiMax</b>  | Worldwide Interoperability for Microwave Access (IEEE 802.16)                              |
| <b>WL</b>     | Wordline                                                                                   |
| <b>WLAN</b>   | Wireless Local-Area Network                                                                |
| <b>WLCG</b>   | Worldwide LHC Computing Grid                                                               |
| <b>WSN</b>    | Wireless Sensor Node                                                                       |

## X

|               |                                                 |
|---------------|-------------------------------------------------|
| <b>XAUI</b>   | (10 Gigabit) eXtended Attachment Unit Interface |
| <b>XDR</b>    | Extreme Data Rate                               |
| <b>XENPAK</b> | 10 Gb Ethernet-compatible fiber-optic standard  |
| <b>XFMR</b>   | TransForMeR                                     |
| <b>XFP</b>    | 10 Gigabit small form-factor pluggable          |
| <b>XGA</b>    | Extended Graphics Array                         |
| <b>XO</b>     | Crystal Oscillator                              |

## Z

|           |           |
|-----------|-----------|
| <b>ZB</b> | Zettabyte |



There are a total of 10 tutorials this year on 10 different topics. Each tutorial, selected through a competitive process within each subcommittee of the ISSCC, presents the basic concepts and working principles of a single topic. These tutorials are intended for non-experts, graduate students and practicing engineers who wish to explore and understand a new topic.

The speakers for these tutorials have spent a significant amount of time and effort in preparing them. In addition, the tutorials have been vetted carefully by their respective coordinators (one coordinator for each tutorial). In some cases, the tutorials have gone through several revisions to bring them to the high quality you will experience at ISSCC. I would like to thank all the speakers and the coordinators for their great effort and dedication, and I would like to invite you to take pleasure in attending one or more of these tutorials.

The coordinators for 2018 ISSCC Tutorials are (in the same order as the tutorials): Jiayoon Ru, Jonathan Chang, Jan Genoe, Dejan Markovic, Matt Straayer, Makoto Ikeda, Ping-Ying Wang, Axel Thomsen, Chun-Huat Heng, and Tony Chan Carusone.

**Ali Sheikholeslami**  
ISSCC Education Chair



### T1 Low-Jitter PLLs for Wireless Transceivers

*Xiang Gao, Credo Semiconductor, Fremont, CA*

8:30 AM

PLLs and frequency synthesizers are key building blocks in wireless transceivers. With the trend of higher data-rate, higher carrier frequency and higher order of modulation, the jitter or phase noise requirement becomes more demanding given a limited power budget. This tutorial starts from the fundamentals of PLL jitter and power consumption. Various sources of PLL jitter and power will be identified and analyzed, and design methodologies to optimize them on both the block and system level will be presented. Finally, the working principle and recent advances of the low jitter sub-sampling PLL architecture will be discussed.

**Xiang Gao** received the B.E. degree from Zhejiang University, Hangzhou, China, in 2004 and the M.Sc. (cum laude) and Ph.D. (cum laude) degrees from the University of Twente, Enschede, The Netherlands, in 2006 and 2010, respectively, both in electrical engineering. From 2010 to 2016, he was a principal engineer and design manager with Marvell Semiconductor, Santa Clara, CA, focusing on wireless transceiver circuits. He is now an Engineering Director with Credo Semiconductor, Milpitas, CA, working on high-speed SerDes. He has published 20 papers and holds 8 patents. He is an IEEE senior member.



### T2 Nonvolatile Circuits for Memory, Logic, and Artificial Intelligence

*Meng-Fan Chang, National Tsing-Hua University, Hsinchu, Taiwan*

8:30 AM

Memory has proven a major bottleneck in the development of energy-efficient chips for IoT applications and artificial intelligence (AI). Recent nonvolatile memory (NVM) devices not only serve as nonvolatile memory macros, but also enable the development of nonvolatile logics (nvLogics) for nonvolatile processors as well as computing-in-memory (CIM) for AI chips. In this tutorial, we begin with an introduction to various NVM technologies (e.g. MRAM, PCM, ReRAM) and the fundamental circuits used in NVM macros. We then review various state-of-the-art circuit techniques for low-power, high-speed on-chip NVM macros. In the third part of the tutorial, we examine some of the challenges involved in the further development of these technologies and review examples of NVM-enabled nvLogics (e.g. nvFlipflop, nvSRAM, nvTCAM) for nonvolatile processors and CIMs for AI chips.

**Professor Chang** received his M.S. from Pennsylvania State University and his Ph.D. degree from National Chiao Tung University in Taiwan. He is currently a full Professor at National Tsing Hua University, Taiwan. Before 2006, he worked in industry for over 10 years.

Between 1997 and 2006, Dr. Chang worked in the design of circuits for SRAM/Flash compilers at Mentor Graphics (New Jersey, US), TSMC (Taiwan), and IP Lib (Taiwan). His research interests include circuit design for volatile and nonvolatile memory, in-memory-computing, artificial intelligence chips, and neuromorphic computing.

Since 2010, Professor Chang has co-authored 35+ top-tier conference papers (14 ISSCC, 14 VLSI, 8 IEDM, and 4 DAC), 30+ IEEE journal papers, and 40+ granted US patents. He is an associate editor for IEEE TVLSI, IEEE TCAD, and IEICE Electronics, and has been serving on the TPC for ISSCC, IEDM (Chair of Memory Technology for 2017), A-SSCC, IEEE CAS Society (Chair Elect of NG-TC), and numerous conferences. He has also been serving as the Associate Executive Director for Taiwan's National Program of Intelligent Electronics (NPIE) since 2011.



### T3 Basics of Quantum Computing

*Edoardo Charbon, EPFL, Neuchatel, Switzerland*

8:30 AM

The tutorial introduces quantum computing to a general audience. It begins with the concept of the qubit and its representation, followed by 1-qubit quantum gates and qubit measurement. It will then move on to 2-qubit states, entanglement, and 2-qubit gates. Switching gears, the tutorial will discuss quantum Fourier transforms, unitary transforms, and quantum arithmetic. Finally, we will go through a simple quantum algorithm and discuss solid-state implementations of networks of qubits, highlighting the challenges for reading and controlling them. We conclude with future perspectives.

**Edoardo Charbon** received the Diploma from ETH Zurich in 1988, the M.S. degree from UC San Diego in 1991, and the Ph.D. degree from UC Berkeley in 1995, all in electrical engineering and EECS. He was with Cadence Design Systems from 1995 to 2000. In 2000, he joined Canesta, Inc., as its Chief Architect. Since 2002, he has been a faculty member at EPFL and, from 2008 to 2016, at TU Delft.

He has authored over 250 papers and two books; he holds 20 patents. His current research interests include 3D imaging, advanced biomedical imaging, quantum integrated circuits, and cryo-CMOS for quantum computing and sensing. Dr. Charbon is a Distinguished Visiting Scholar with the W. M. Keck Institute for Space, California Institute of Technology, a Fellow of the Kavli Institute of Nanoscience Delft, a Distinguished Lecturer of the IEEE Photonics Society, and a Fellow of the IEEE.

### T4 Error-Correcting Codes in 5G/NVM Applications

*Hsie-Chia Chang, National Chiao Tung University, HsinChu, Taiwan*

10:30 AM

The growing need for efficient data transmission and storage is driving information delivery technologies to new frontiers. Whether in the new radio (NR) links of 5G communications or in emerging non-volatile memories (NVMs) with continuously increasing capacity, error-correcting codes (ECCs) are essential to maintaining data correctness. In this tutorial, we will cover major concepts in ECC design and architecture, including multi-Gb/s LDPC-BC, energy-efficient LDPC-CC, and Polar/Turbo decoders that fulfill different requirements in various 5G scenarios. For NVM applications, we will introduce 1-error and 2-error correcting schemes with parallel architectures for NOR flash, as well as BCH and LDPC coding schemes for NAND/3D-NAND, to address low-latency and high-throughput solutions.

**Hsie-Chia Chang** received the B.S., M.S., and Ph.D. degrees from National Chiao Tung University, Hsinchu, Taiwan, in 1995, 1997, and 2002, respectively, all in electronics engineering.

He was with OSP/DE1, MediaTek Corporation, from 2002 to 2003, where he was involved in decoding architectures for combo single chip. In 2003, he joined the faculty of the Electronics Engineering Department, National Chiao Tung University, where he has been a Professor since 2010. His research interests include algorithms and VLSI architectures in signal processing, in particular error control codes and crypto-systems. He has published more than 100 IEEE journal/conference papers and holds more than 50 U.S./Taiwan patents. Recently, he has focused on designing high code-rate ECC schemes for flash memory, PUF implementations for secure MCU systems, and multi-Gb/s chip implementations for wireless communications.

Dr. Chang has served as the Deputy Director General of the Chip Implementation Center, Taiwan, since 2017. He has also served as an Associate Editor of the IEEE Transactions on Circuits and Systems I: Regular Papers since 2012, and as a Technical Program Committee Member of the IEEE Asian Solid-State Circuits Conference from 2011 to 2013 and of the International Solid-State Circuits Conference in 2018. He was a recipient of the Outstanding Youth Electrical Engineer Award from the Chinese Institute of Electrical Engineering in 2010, and the Outstanding Youth Researcher Award from the Taiwan IC Design Society in 2011.

### T5 Hybrid Design of Analog-to-Digital Converters

*Seng-Pan (Ben) U, University of Macau, Macau, Macau*

10:30 AM

Traditionally, ADC architectures have been sorted into distinct categories such as flash, SAR, pipeline, and delta-sigma. Recently, improvements in ADC power, speed, and resolution have been enabled by hybrid approaches that combine techniques from many ADC architectures. Further, this trend of hybrid design extends beyond the choice of quantizer to include a mix of circuit topologies in key ADC building blocks. The resulting degrees of freedom allow designers to fully optimize their converters, leading to performance levels beyond what can be achieved with conventional architectures. This tutorial will start with a general overview of key ADC architectures (e.g. flash, SAR, pipeline, and delta-sigma), highlighting basic operation and design trade-offs. Second, architectural hybrid designs that consider various quantizer options will be discussed. Last, illustrative examples of hybrid circuit topologies and techniques will be discussed, with emphasis on design choices that enabled performance benefits for the specific application of interest.

**Seng-Pan (Ben) U** received the dual Ph.D. degree from the University of Macau (UM) and Instituto Superior Técnico, Portugal, in 2002. He is currently Professor and Deputy Director of the State-Key Laboratory of Analog & Mixed-Signal (AMS) VLSI of UM. He is also the co-founder, corporate R&D director, and Macau site general manager of Synopsys Macau Ltd (formerly Chipidea Microelectronics Macau).

He has co-authored 200+ publications and 4 books, and co-holds 14 US patents. He was an A-SSCC 2013 tutorial speaker on energy-efficient data converters and an SSCS Distinguished Lecturer (2014-2015). He was the co-recipient of the 2014 ESSCIRC Best Paper Award, and was also the advisor for student recipients of the SSCS Pre-doc Achievement Award, ISSCC Silk-Road Award, and A-SSCC Student Design Contest awards in the data converter field. He was the first recipient from Macau of the National Science & Technology (S&T) Award and the Ho Leung Ho Lee Foundation Award. He has received 7 Macau S&T Awards, 2 business awards, and the government Honorary Title of Value. He is currently an IEEE Fellow, was elected the Scientific Chinese of the Year 2012, and was recently appointed a member of the Science and Technology Commission of the China Ministry of Education. He is a member of the TPC of ISSCC, data converter sub-committee chair of A-SSCC, analog sub-committee chair of VLSI-DAT, and editorial board member of the journal AICSP.

### T6 Single-Photon Detection in CMOS

*Matteo Perenzoni, Fondazione Bruno Kessler, Trento, Italy*

10:30 AM

Every single photon carries information in position, time, etc. Single-photon devices are now demonstrated and available in several CMOS technologies, but the needed circuits and architectures are completely different from conventional visible light sensors.

This tutorial starts with a description of the structure and operation of a single-photon detector, and continues with the definition of circuits for the front-end electronics needed to efficiently manage the extracted information, addressing challenges and requirements. It concludes with an overview of the different architectures that are specific to each application field, with examples in the biomedical, consumer, and space domains.

**Matteo Perenzoni** received the Laurea degree in Electronics Engineering from the University of Padova, Italy, in 2002. In January 2004, he joined Fondazione Bruno Kessler (FBK), Trento, Italy, as a research scientist, where he now leads the Integrated Radiation and Image Sensors (IRIS) research unit. He has collaborated as a contract professor with the University of Trento, and as a visiting researcher at TU Delft. His research interests include the design of advanced image sensors from single-photon to multispectral sensing.

### T7 Basics of Adaptive and Resilient Circuits

*Keith A. Bowman, Qualcomm, Raleigh, NC*

1:30 PM

Dynamic device and circuit parameter variations degrade processor performance, energy efficiency, yield, and reliability across all market segments, ranging from small embedded cores in IoT devices to large multicore servers. This tutorial introduces the primary variations during a processor's operational lifetime, including transient voltage droops, temperature changes, and radiation-induced soft errors, as well as persistent transistor and interconnect aging. This presentation then describes the negative impact of these variations on timing and data retention in logic and embedded memory across a wide range of voltages and clock frequencies. To mitigate these adverse effects from dynamic variations, this tutorial presents adaptive and resilient circuits, while highlighting the key design trade-offs and testing implications for product deployment.

**Keith Bowman** is a Principal Engineer and Manager in the Processor Research Team at Qualcomm Technologies, Inc. in Raleigh, NC. He pioneered the invention, design, and test of Qualcomm's first commercially successful circuit for mitigating the adverse effects of supply voltage droops. He received the Ph.D. degree from the Georgia Institute of Technology and worked at Intel for 12 years. He has published over 70 technical conference and journal papers and presented over 30 tutorials on variation-tolerant circuit designs. He currently serves on the ISSCC technical program committee.



### T8 Fundamentals of Switched-Mode Power-Converter Design

*Hoi Lee, The University of Texas at Dallas, Richardson, TX*

1:30 PM

Switched-mode power converters are very popular in power management IC designs for voltage conversions and are widely adopted for today's smart phones and tablets thanks to their high power-conversion efficiency and high-output-power capability. Their applications also include high-voltage and high-power automotive systems, LED lighting, and renewable energy systems. In this tutorial, switched-mode power-converter fundamentals and the design of CMOS switched-mode power converters will be covered. Topics include an overview of non-isolated power-converter topologies such as buck, boost, and non-inverting buck-boost; loss analysis and comparisons between continuous and discontinuous conduction-mode operations; different control schemes; and frequency compensation between voltage-mode and current-mode operations. Topics also include detailed circuit implementations of building blocks such as FET-based and filter-based current sensors, dead-time control, and gate drivers, as well as state-of-the-art designs. Practical design examples are used throughout the presentation. Advanced topics include a discussion of three-level converters.

**Hoi Lee** received the B.Eng., M.Phil., and Ph.D. degrees in Electrical and Electronic Engineering from the Hong Kong University of Science and Technology. Since 2005, he has been with the Department of Electrical and Computer Engineering, the University of Texas at Dallas, Richardson, TX, where he is a Professor. His research interests include power-management integrated circuits, power-converter topologies and control methodologies, wireless power and energy-harvesting circuits, and analog and mixed-signal integrated circuits.

Dr. Lee serves as an Associate Editor of IEEE Transactions on Circuits and Systems I: Regular Papers. He is also on the Technical Program Committees of ISSCC, CICC, and ISPSD. He has authored or coauthored over 100 peer-reviewed conference and journal papers. He received the 2011 NSF CAREER Award and was the recipient of the CICC 2002 Best Student Paper Award.



## T9 Digital RF Transmitters

*Renaldi Winoto, Tectus, Saratoga, CA*

3:30 PM

Direct digital-to-RF conversion at the antenna interface offers many exciting opportunities to push the performance envelope of RF transmitters in efficiency, area, signal bandwidth, and modulation quality. This tutorial will provide a complete overview of digital transmitter architectures, starting from digital bits at the symbol rate all the way to the antenna. We will start at the heart of this architecture, with a review of state-of-the-art, high-efficiency, digital PA topologies. Then we will discuss all auxiliary circuits and digital signal-processing tricks needed around the PA core to enable the transmitter to meet strict signal-fidelity and spectrum-cleanliness requirements while simultaneously providing the PA core with the optimal environment for highest efficiency. Topics for discussion include: digital sample-rate conversion, 1-D and 2-D digital pre-distortion for wideband signals, and peculiarities of signal processing in the polar domain.

**Renaldi Winoto** received his Bachelor's degree from Cornell University in 2003 and his Ph.D. degree from the University of California, Berkeley in 2009. He was with Marvell Semiconductors from 2009 to 2017, where he worked as an Engineering Director and led a group responsible for the definition of RF transceiver architectures and the design of RF power amplifiers for Wireless LAN. In 2017, he joined Tectus Corporation, a stealth-mode start-up. He is a member of the technical program committees for RFIC and ISSCC. His research interest is in mixed-signal and radio-frequency circuits for wireless communication systems.



## T10 ADC-Based Serial Links: Design and Analysis

*Sam Palermo, Texas A&M University, College Station, TX*

3:30 PM

Growing serial I/O data rates, over both severe low-pass electrical and dispersive optical channels, necessitate increased equalization complexity and consideration of more bandwidth-efficient modulation schemes, such as four-level pulse-amplitude modulation (PAM4). Serial links which utilize ADC-based receiver front-ends offer a potential solution, as they enable more powerful and flexible DSP for equalization and symbol detection and can easily support advanced modulation schemes. This tutorial will provide an overview of key concepts in ADC-based serial links that support operation over high-loss channels. Topics covered include high-speed ADC topologies, digital equalizers, benefits of partial analog equalization, modeling approaches, and calibration techniques.

**Samuel Palermo** received the B.S. and M.S. degrees in electrical engineering from Texas A&M University, College Station, TX in 1997 and 1999, respectively, and the Ph.D. degree in electrical engineering from Stanford University, Stanford, CA in 2007.

From 1999 to 2000, he was with Texas Instruments, Dallas, TX, where he worked on the design of mixed-signal integrated circuits for high-speed serial data communication. From 2006 to 2008, he was with Intel Corporation, Hillsboro, OR, where he worked on high-speed optical and electrical I/O architectures. In 2009, he joined the Electrical and Computer Engineering Department of Texas A&M University where he is currently an associate professor. His research interests include high-speed electrical and optical interconnect architectures, RF photonics, high performance clocking circuits, and integrated sensor systems.

Dr. Palermo is a recipient of a 2013 NSF-CAREER award. He is a member of Eta Kappa Nu and IEEE. He served as an associate editor for IEEE Transactions on Circuits and Systems – II from 2011 to 2015 and served on the IEEE CASS Board of Governors from 2011 to 2012. He is currently a distinguished lecturer for the IEEE Solid-State Circuits Society. He was a coauthor of the Jack Raper Award for Outstanding Technology-Directions Paper at the 2009 International Solid-State Circuits Conference, the Best Student Paper at the 2014 Midwest Symposium on Circuits and Systems, and the Best Student Paper at the 2016 Dallas Circuits and Systems Conference. He received the Texas A&M University Department of Electrical and Computer Engineering Outstanding Professor Award in 2014 and the Engineering Faculty Fellow Award in 2015.

## F1: Intelligent Energy-Efficient Systems at the Edge of IoT

**Organizer:** **Vivek De**, Intel, Hillsboro, Oregon

**Committee:** **Dennis Sylvester**, University of Michigan, Ann Arbor, MI

**James Myers**, ARM, Cambridgeshire, United Kingdom

**Jun Deguchi**, Toshiba Memory Corporation, Kawasaki, Japan

**Shinichiro Shiratake**, Toshiba Technology Development, Yokohama, Japan

**Ingrid Verbauwhede**, KU Leuven, Leuven, Belgium

The energy efficiency and security of ubiquitous smart, secure & connected devices at the edge nodes of the IoT are critical for realizing robust and intelligent end-to-end cyberphysical systems that deliver compelling new experiences and capabilities based on deep learning and other natural computing technologies for big data applications. This forum brings together designers of low-power SoCs and of sensor, data converter, memory & wireless circuits, along with machine learning & hardware security experts, to discuss the challenges and recent advances in the development of these edge node devices. All key hardware technologies are covered. The first speaker presents smart & energy-efficient integrated sensor technologies. The second speaker then discusses low-power analog front-end & data converter circuits. The third speaker provides an overview of the various intelligent and energy-efficient memory & storage technologies and systems. Ultra-low-power wireless connectivity circuits and systems are discussed by the next speaker. The fifth speaker then presents compressive imaging techniques for CMOS image sensors. Ultra-low-power SoC designs with intelligent compute and learning engines for efficient sensor data processing are covered by the next speaker. The seventh speaker presents emerging natural computing hardware based on the Ising model for combinatorial optimization. The last speaker presents the key building blocks and techniques for implementing hardware security at the IoT edge nodes.

### Smart and Energy-Efficient Integrated Sensors at the Edge of the IoT

*Bruno W. Garlepp, TDK, San Jose, CA*

As the Internet-of-Things continues to push further and further into our environment and our everyday lives, its ability to connect into our physical world to enrich our experiences is largely defined by the sensors placed along its edges. These sensors must operate at or beyond the level of human experience to provide meaningful information, but must likewise be cheap and consume very little power to enable the IoT to access such information ubiquitously. This presentation explores this tension between sensor performance and sensor cost, in terms of the complexity and power consumption that drive their attachment to the IoT. Multiple examples are given that illustrate how clever design choices at the transducer, circuit, system, and packaging levels can be applied to sensors and sensor systems to achieve energy, complexity, and cost efficiency by being smart about balancing performance vs. power consumption, data transmission vs. local processing, and integration vs. separation.

**Bruno W. Garlepp** has over 20 years of industrial experience bringing innovative mixed-signal ICs into mass production. He received the B.S. degree in Electrical Engineering from UCLA in 1993, and the M.S. degree in Electrical Engineering from Stanford in 1995.

### Low Power Analog Front Ends and Data Converters

*Gabriele Manganaro, Analog Devices, Wilmington, MA*

The IC design of analog/mixed-signal conditioning and data conversion circuits for IoT presents several important technical challenges because of conflicting demands in absolute power consumption, signal processing and conversion efficiency, physical size, process technology and implementation costs among others. This presentation will discuss such requirements and tradeoffs and it will cover both established and emerging architectures, as well as circuit design techniques for analog front ends and analog-to-digital converters for IoT devices. Selected case studies will be presented in order to illustrate the wide diversity of actual applications, as well as to exemplify the applicability of the discussed concepts.

**Gabriele Manganaro** holds a Dr.Eng. and a Ph.D. degree in Electronics from the University of Catania, Italy. Beginning in 1994, he did research with ST Microelectronics and at Texas A&M University. He worked in the IC design of data converters at Texas Instruments, Engim, Inc., and as Design Director at National Semiconductor. Since 2010, he has been Engineering Director for high-speed converters at Analog Devices. He served on the data-converters technical sub-committee of the ISSCC for seven years. He was Associate Editor for IEEE Trans. on Circuits and Systems – Part II, and then Associate Editor, Deputy Editor-in-Chief and finally, Editor-in-Chief for IEEE Trans. on Circuits and Systems – Part I. He has authored/co-authored more than 60 papers and three books (notably “Advanced Data Converters”, Cambridge University Press, 2011) and has been granted 15 US patents, with more pending. He has received several scientific awards, including the 1995 CEU Award from the Rutherford Appleton Laboratory (UK), the 1999 IEEE Circuits and Systems Outstanding Young Author Award, and the 2007 IEEE European Solid-State Circuits Conference Best Paper Award. He is an IEEE Fellow (since 2016), a Fellow of the IET (since 2009), a member of Sigma Xi, and a member of the Board of Governors of the IEEE Circuits and Systems Society (2016-2018).



### Intelligent Energy-Efficient Memory/Storage Systems for IoT Edge Devices and IoT Edge Gateways

*Shinobu Fujita, Toshiba, Kawasaki, Japan*

This paper presents energy-efficient IoT memory/storage systems based on volatile/"semi-nonvolatile"/nonvolatile memory (NVM) combinations. Multiple case studies of these memory/storage systems are discussed for IoT edge devices and IoT edge gateways. Features of various IoT-edge NVMs such as MRAM, ReRAM, FeRAM, flash memories and OTP are also extracted from these case studies.

**Shinobu Fujita** received the Ph.D. degree from the University of Tokyo and joined Toshiba in 1989. He has been working on new non-volatile memory (NVM) circuit and system designs for over 15 years. His major designs are ReRAM-based NV logic, ReRAM-based FPGA, NVM-based random number generators, NV-SRAM-based cache memories and normally-off processors with e-STT-MRAM used in applications ranging from cloud computing to IoT/wearables. Currently, he is a Senior Fellow of the Toshiba Corporate R&D Center and is leading a project on applying NVM for energy-efficient computing.



### Ultra-Low-Power Wireless Connectivity

*Carolynn Bernier, CEA-LETI-MINATEC, Grenoble, France*

The IoT is enabled by a large number of wireless technologies. Among them, low-power wide-area networks (LPWAN) are an exciting and disruptive technology that has opened up a new application space. The ensuing enthusiasm observed in the wireless community has led to the development of a plethora of long-range wireless connectivity solutions. This diversity inevitably generates a complex environment for designers, from circuits to applications. After a brief discussion on the fundamental limitations of LPWAN technologies and a presentation of wireless ICs for LPWAN, this talk will focus on software-defined initiatives for reducing the uncertainty of this complex environment. The talk will also discuss the uncertainty of the wireless medium itself and investigate channel-aware radios for even lower-power operation.

**Carolynn Bernier** received the B.A.Sc. degree in Computer Engineering from the University of Toronto in 1998, and the Ph.D. in Microelectronics from the National Polytechnical Institute of Grenoble, France, in 2003. Since then, she has been with the RF IC Design and Architectures laboratory of CEA-LETI, a French public research institute.



### Compressive Imaging on CMOS Image Sensors

*Yusuke Oike, Sony Semiconductor Solutions, Atsugi, Japan*

Image sensors have continued to evolve to produce more natural images – closer to “human perception” – as applications expand from video cameras to digital still cameras and smartphones. Now machine learning has progressed dramatically, and AI systems demand that image sensors extract feature information efficiently for “machine-to-machine perception”. This talk will introduce state-of-the-art energy-efficient imaging techniques for machine-to-machine perception, for example, region-of-interest (ROI) imaging, feature-extraction vision sensing, temporal dynamic vision sensing, and compressive sensing.

**Yusuke Oike** received the B.S., M.S., and Ph.D. degrees in electronic engineering from the University of Tokyo, Tokyo, Japan, in 2000, 2002, and 2005, respectively. In 2005, he joined Sony Corporation, Tokyo, Japan, where he has been involved in research and development of architectures, circuits, and devices for CMOS image sensors. From 2010 to 2011, he was a Visiting Scholar at Stanford University, Stanford, CA. He is currently a General Manager with Sony Semiconductor Solutions Corporation, where he is in charge of the development of CMOS image sensors. His current research interests include pixel architecture and mixed-signal circuit design for image sensors and image processing algorithms.



### Ultra-Low-Power SoCs for Local Sensor Data Processing in a Sustainable IoT

*David Bol, Université Catholique de Louvain, Louvain-la-Neuve, Belgium*

The IoT is changing the way we live. However, challenges remain for the sustainable deployment of hundreds of billions of sensors at the IoT edge, in terms of battery replacement and wireless data traffic. To alleviate this, we need to process the data locally on the sensor node in order to extract, encrypt, and then transmit only the meaningful information. In this forum, we will first review circuit techniques to maximize the energy efficiency of complex data processing tasks in microcontroller-type SoCs: ultra-low-voltage (ULV, <0.5V) operation, CMOS technology scaling, custom SRAM design and adaptive PVT compensation. As software execution does not yield sufficient computing performance within typical power budgets, we will review and compare recent architectural trends for sensor data processing: approximate computing for feature extraction, in-memory computing for classification, deep neural networks for inference and spiking neural network processors with on-line learning.

**David Bol** is an assistant professor at Université catholique de Louvain (UCL). He received the Ph.D. degree in Engineering Science from UCL in 2008 in the field of ultra-low-power digital nanoelectronics. In 2005, he was a visiting Ph.D. student at the CNM, Sevilla, Spain, and in 2009, a postdoctoral researcher at intoPIX, Louvain-la-Neuve, Belgium. In 2010, he was a visiting postdoctoral researcher at the UC Berkeley Lab for Manufacturing and Sustainability, Berkeley, CA. In 2015, he participated in the creation of e-peas semiconductors, Louvain-la-Neuve, Belgium. He leads the Electronic Circuits and Systems (ECS) research group, which focuses on ultra-low-power design of integrated circuits for the IoT, including computing, power management, sensing and RF communications, with emphasis on technology/circuit interaction in nanometer CMOS nodes, variability mitigation, and mixed-signal SoC architecture and implementation. Prof. Bol has authored more than 90 papers and conference contributions and holds three issued patents. He (co-)received three Best Paper/Poster/Design Awards at IEEE conferences (ICCD 2008, SOI Conf. 2008, FTFC 2014). He serves as a reviewer for various IEEE journals/conferences and has presented several keynotes at international conferences.



### A Natural Computing with Ising Model to Solve Combinatorial Optimization Problems

*Masanao Yamaoka, Hitachi, Tokyo, Japan*

A new computing paradigm, natural computing, is proposed to overcome the saturation in computing performance caused by the end of semiconductor scaling. To solve combinatorial optimization problems efficiently, a form of natural computing based on the Ising model, called Ising computing, has been developed. It maps problems to an Ising model, a model that expresses the behavior of magnetic spins, and solves the problems through the model's own convergence property. A CMOS prototype chip based on Ising computing was fabricated, and its power efficiency was confirmed to be 1800 times higher than that of conventional von Neumann computers. In this talk, in addition to an introduction to CMOS Ising computing, applications including AI are also discussed.

**Masanao Yamaoka** received the B.E., M.E., and Ph.D. degrees in Electronics and Communication Engineering from Kyoto University, Kyoto, Japan, in 1996, 1998, and 2007, respectively. In 1998, he joined the Central Research Laboratory, Hitachi, Ltd., Tokyo, Japan, where he engaged in the research and development of low-power embedded SRAM and CMOS circuits. Since 2012, he has been engaged in the research of new-paradigm computing using CMOS circuits.



### Hardware Security at the Edge of IoT

*Sanu Mathew, Intel, Hillsboro, OR*

Device authentication and secure data-transfer are critical operations for enabling high-volume deployment of IoT edge devices. The ultra-low area and energy constraints of these energy-harvesting devices, along with the pervasive threat of malicious attacks impose significant challenges to the design of reliable, attack-resilient security hardware primitives for IoT edge devices. In this presentation, we will describe the state-of-the-art in: i) strong Physically Unclonable Function (PUF) circuits for secure authentication with large challenge-response-space and machine-learning attack resistance; ii) low-area True-Random-Number Generators (TRNG) using light-weight entropy extractors; and, iii) side-channel-attack resistant AES hardware accelerators targeted for use at the edge of the IoT.

**Sanu Mathew** is a Senior Principal Engineer with the Circuits Research Labs at Intel Corporation, Hillsboro, OR, where he is responsible for developing energy-efficient hardware accelerators for encryption and security. He received his Ph.D. degree in Electrical and Computer Engineering from the State University of New York at Buffalo in 1999. He holds 41 issued patents, has 63 patents pending, and has published over 77 conference/journal papers. He has been with Intel for the past 18 years.

## F2: FinFETs & FDSOI – A Mixed Signal Circuit Designer's Perspective



**Organizers:** **Venkatesh Srinivasan**, Texas Instruments, Dallas, TX  
**Stéphane Le Tual**, STMicroelectronics, Crolles, France  
**Tai-Cheng Lee**, National Taiwan University, Taipei, Taiwan

**Committee:** **John Long**, University of Waterloo, Waterloo, Canada  
**Xin He**, NXP, Eindhoven, The Netherlands  
**Jaeha Kim**, Seoul National University, Seoul, Korea

**Moderator:** **David Robertson**, Analog Devices, Wilmington, MA

With circuit design in the nanometer regime, designers have a choice between transistors in FinFET or FDSOI technology. This Forum brings together experts from industry and academia to discuss the opportunities these technologies present for designers. The experts will present the physics/modeling of FinFET & FDSOI transistors, followed by an in-depth analysis of design considerations for modules spanning analog/mixed-signal to mm-Wave regimes. The Forum will conclude with a panel discussion with the speakers.



### FinFET Basics: Reconsideration of the Capabilities from Fundamental Device Concepts

*Digh Hisamoto, Hitachi, Kokubunji, Tokyo, Japan*

The 3-D channel structure of the FinFET has caused controversy because of its unusual appearance. Since we proposed the structure, there have been many discussions about its advantages and disadvantages in comparison with conventional planar MOSFETs and FDSOI MOSFETs. However, most of these discussions were based on structures designed to reduce the risk that accompanies the early stage of mass production. Thus, many parasitic effects concealed the true nature of the device.

Today, as many semiconductor manufacturers offer 2<sup>nd</sup>- or 3<sup>rd</sup>-generation FinFETs, the process and device structures have become sophisticated and mature. Here, to estimate the real capabilities of FinFETs, I will reconsider the fundamental device concepts we expected when we developed the Fin-structure. This discussion opens up opportunities for suitable applications of FinFETs.

**Digh Hisamoto** received the B.S. and M.S. degrees in reaction chemistry and the Ph.D. degree in electronic engineering from the University of Tokyo, Tokyo, Japan, in 1984, 1986, and 2003, respectively.

In 1986, he joined Central Research Laboratory, Hitachi Ltd., Tokyo, where he has been working on ULSI device physics and process technologies. He developed scaled CMOS devices and memory devices including DELTA (fully depleted lean-channel transistor), the original model of the FinFET.

From 1997 to 1998, he was a Visiting Industrial Fellow at the University of California, Berkeley, where he developed the first FinFETs.

Since 2000, he has developed embedded non-volatile Flash memories using split-gate MONOS charge-trapped technology. He has since expanded his research interests to include RF devices, tunnel FETs, wide-gap semiconductor power devices, and sensing devices.

He served as a member of the technical program committees of the International Conference on Solid State Devices and Materials (SSDM), the International Electron Devices Meeting (IEDM) and the VLSI Technology Symposium, and currently serves as an Organizing Committee Member of SSDM and an Executive Committee Member of the VLSI Symposia.

He served as Director of the Japan Society of Applied Physics (JSAP) from 2011 to 2013. Since 2015, he has been a Visiting Professor of the School of Engineering, Tokyo Institute of Technology.

Dr. Hisamoto is a Fellow of IEEE and a Fellow of the Japan Society of Applied Physics (JSAP).



### RF & mm-Wave Design in FinFET Technology

*Steven Callender, Intel, Hillsboro, OR*

The integration of mm-wave wireless systems in CMOS processes enables low-cost and easily scalable solutions. However, fully integrated SoCs at mm-wave frequencies remain an elusive goal owing to the fact that although scaling continues to benefit digital baseband chips, it generally has an adverse impact on the RF/mm-wave front-end. In this talk, we will investigate the mm-wave performance offered by today's FinFET process nodes. We will provide practical insight into designing in a FinFET process from a designer's perspective. Key process performance metrics of both actives and passives, which have direct implications on mm-wave performance, will be analyzed and discussed. Furthermore, techniques for improving these key metrics (e.g. efficiency and bandwidth) will also be presented. Example mm-wave designs in Intel's FinFET process will be used as a case study.

**Steven Callender** received the B.S. degree in EE from Columbia University in 2008, and the M.S. and Ph.D. degrees in EE from UC Berkeley in 2010 and 2015, respectively. In 2015, he joined Intel Labs as a Research Scientist focusing on the development of next-generation wireless systems. His research interests include RF/mm-wave circuits and wideband mixed-signal systems.



### FDSOI Basics – Physics, Device Performance and RF & mm-Wave Design Enablement

*Patrick Scheer, STMicroelectronics, Crolles, France*

This talk will cover various aspects of Ultra-Thin Body and Box (UTBB) FDSOI technology, from device physics and performance to design enablement for analog, RF & mm-wave circuits. After highlighting the main device architecture differences with bulk and some unique features of FDSOI technology, the challenges and possible solutions for compact modeling of UTBB FDSOI MOS transistors in DC and RF & mm-wave domains will be addressed. Device performance and design enablement capability will finally be illustrated using STMicroelectronics 28nm FDSOI technology and its CAD methodology powered by the LETI-UTSOI2 compact model.

**Patrick Scheer** received the engineering degree in electronics from the Ecole Nationale Supérieure d'Electronique et de Radioélectricité de Grenoble and the M.S. degree in optics, optoelectronics and microwaves from the Institut National Polytechnique de Grenoble in 1993. He received the Ph.D. degree in optics and optoelectronics from the Ecole Nationale Supérieure de l'Aéronautique et de l'Espace, Toulouse, France, in 1998.

He joined the central R&D site of STMicroelectronics in Crolles, France, in 1998 to develop high-frequency models for MOS transistors in advanced CMOS and BiCMOS technologies. He then set up and led an analog & RF SPICE modeling team focused on active devices. Since 2012, he has been working on advanced modeling solutions for analog & RF designs and is a senior member of technical staff.

His interests are in semiconductor device physics, small-signal, noise and large-signal behavior of active devices, compact modeling, parameter extraction methodologies and device variability modeling, in both bulk and FDSOI technologies. He is the co-author of more than 30 international journal and conference papers in the field.



### Millimeter-Wave FDSOI Power Amplifiers for 5G Mobile Communications

*Eric Kerhervé, University of Bordeaux, Talence, France*

The next generation of high-data-rate mobile communication systems (5G) will require a drastic reduction in power consumption and offer more flexible use of the RF-to-mm-wave spectrum. In this talk, we will propose our technical solutions for 28/37/60GHz power amplifiers using FDSOI CMOS technology. This promising technology meets the 5G requirements by using the backgate of the transistors to achieve a multimode PA configuration (high gain/high linearity), changing current and voltage biases in real time and thus improving the power consumption over a large power back-off. We will also introduce recent results for a self-contained FDSOI 28GHz PA that achieves good robustness to phased-array-antenna load variations.

**Eric Kerhervé** received the Ph.D. degree in Electrical Engineering from Bordeaux University, France in 1994. He is a full professor at the Bordeaux Polytechnic Institute and serves as director of the STMicroelectronics/IMS joint Lab and deputy director of the IMS research institute in Bordeaux. His research activities focus on the design of RF and millimeter-wave power amplifiers in silicon and GaN technologies. He is or was involved in many European projects to design silicon RF/mm-wave power amplifiers for mobile communications. He has authored or co-authored more than 200 technical papers in this field, and was awarded 24 patents. He has organized 8 RFIC/MMT workshops on advanced silicon technologies for RF and mm-wave applications. He was co-chair of IEEE ICECS'2006 and IEEE NEWCAS'2011, and chair of EuMIC'2015. He is a member of the Executive Committee of SiRF and a member of the IEEE MTT Microwave and Millimeter-Wave Solid State Devices committee.



### FinFETs for Analog & Mixed-Signal Designs

*Lawrence Loh, MediaTek, San Jose, CA*

High-performance and cloud computing are the high-growth applications that are driving product and technology innovation today. The continued scaling of planar CMOS, driven by Moore's Law, has achieved greater digital density, but has also resulted in increased leakage power, which makes 3D FinFETs an attractive alternative. Transitioning from planar devices to FinFETs for analog & mixed-signal design, however, is not a straightforward task and is full of challenges, but also presents new opportunities. The benefits of FinFETs over planar devices are lower leakage, lower variability, and openings for simpler circuit architectures. Also, highly digitally assisted analog circuits become even more effective. Unfortunately, higher parasitics and quantized device sizes are drawbacks which must be designed around, often requiring new circuit techniques. In this forum, various analog & mixed-signal FinFET circuit examples will be given, such as OPAMP, SRAM, AD/DA converters, and high-speed circuits. In addition, new design strategies required for FinFET technology will be discussed. At the conclusion, an issue that is of particular importance to fabless semiconductor companies – the increased design complexity of FinFETs resulting in longer design cycles and time-to-market – will be touched upon.

**Lawrence Loh** is a Corporate Senior Vice President of MediaTek Inc. He oversees the company's Central Engineering Group, responsible for engineering the company's SOC and chipset designs, development and implementation activities for all of MediaTek's product lines including mobile communication, application processors, wireless connectivity, IoT, automotive, home entertainment, optical storage and broadband/networking business. He is also serving as President of MediaTek USA, Inc., responsible for the company's global operations in Europe and America.

Dr. Loh started his first circuit design position at IMP and later joined Cirrus Logic, where his last position was Director of Analog IC Engineering. In 1998, he founded Silicon Bridge Inc., where he successfully led a number of analog/mixed-signal IC development projects with major semiconductor companies including MediaTek and Altera Corporation. Before joining MediaTek in 2004, he contributed to the IC design industry in areas of read/write channels for magnetic and optical storage, high-performance analog filters, solid-state fingerprint sensors, high-speed SERDES and wireline transceivers for various business applications. He received his Ph.D. degree in Electrical Engineering from Texas A&M University, College Station, Texas. He has authored/co-authored dozens of technical papers/patents in areas of analog and mixed-signal integrated circuits/systems design and has contributed many panel talks and invited keynote speeches at numerous international conferences and professional communities. He served on the ISSCC International Technical Program Committee for 5 consecutive years starting in 2005. He is currently serving on the Steering Committee of A-SSCC and also on the Board of Directors for the Global Semiconductor Alliance (GSA).



### High Speed Transceivers Using FinFETs

*Ken Chang, Xilinx, San Jose, CA*

This talk will discuss several architecture and circuit design decisions for high-speed transceivers related to scaled and specifically FinFET technology. It will also discuss the noise impact from an example FinFET process that affects the design choice. Case studies based on 28G and 56G long reach wireline transceivers in 16nm FinFET technology will be used to illustrate the design challenges and decisions.

**Ken Chang** (M'99, SM'14) received the B.S. degree in electrical engineering from National Taiwan University, Taipei, Taiwan, in 1990, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, USA, in 1994 and 1999, respectively.

From 1999 to 2010, he was with Rambus Inc. He led several projects including a 5Gb/s/lane 12Gbyte FlexIO™ interface for CELL™ processors, as well as 16Gb/s and 20Gb/s low-power memory interfaces exploring various signaling techniques. Since 2010, he has been with Xilinx Inc., where he has led the SerDes technology group, focused on developing multistandard SerDes IPs for FPGAs, covering top line rates of 10Gb/s, 28Gb/s, and 56Gb/s, all capable of long-reach backplane data transmission. His research interests include high-speed mixed-signal CMOS circuit design, transmitter and receiver design, CDR, equalization, PLL/DLL design, circuit noise analysis, signal integrity analysis, and mixed-signal design methodology.

He has authored and coauthored 40+ IEEE conference/journal publications and holds 30+ U.S. patents in the high-speed link area. He has served on the technical program committee for the VLSI circuit symposium since 2009 and is now the technical program chair of the 2018 VLSI circuit symposium. He has also served on technical program committees for ISSCC and CICC. He is a co-author of the 2008 and 2014 CICC best regular papers. He is a senior member of IEEE.



### The Fourth Terminal in FDSOI

*Borivoje Nikolić, University of California at Berkeley, Berkeley, CA*

Technology and design options available in FDSOI will be discussed. Particular attention will be paid to the use of body bias in digital, SRAM and mixed-signal designs. The design options will be explored on building blocks of a system-on-a-chip, and will include processor core, cache memories, supply regulation and data conversion. A series of energy-efficient microprocessors will be used as illustrative examples. They are based on an open and free Berkeley RISC-V architecture and implement several techniques for operation in a very wide voltage range utilizing 28nm FDSOI.

**Borivoje Nikolić** is the National Semiconductor Distinguished Professor of Engineering at the University of California, Berkeley. He received the Dipl.Ing. and M.Sc. degrees in electrical engineering from the University of Belgrade, Serbia, in 1992 and 1994, respectively, and the Ph.D. degree from the University of California at Davis in 1999. His research activities include digital, analog and RF integrated circuit design and communications and signal processing systems. He is co-author of the book Digital Integrated Circuits: A Design Perspective, 2nd ed, Prentice-Hall, 2003. Dr. Nikolić received many awards in his career, including the NSF CAREER award in 2003, and best paper awards at the IEEE International Solid State Circuits Conference, the Symposium on VLSI Circuits, the IEEE International SOI Conference, the European Solid-State Circuits Research Conference, the European Solid-State Device Research Conference, the S3S conference and the ACM/IEEE International Symposium of Low-Power Electronics.



### RF, mm-Wave and Fiber-Optics Design in FDSOI CMOS Technologies

*Sorin P. Voinigescu, University of Toronto, Toronto, Canada*

This presentation will discuss the main features of FDSOI CMOS technology and how to efficiently use its unique features for RF, mm-wave and broadband SoCs. We will review the impact of the back-gate bias on the measured I-V, transconductance, fT and fMAX characteristics and compare the maximum available gain, MAG, of FDSOI MOSFETs with those of planar bulk CMOS and SiGe BiCMOS transistors through measurements up to 325 GHz. Next, examples will be provided of VCO, doubler, switches, PA and quasi-CML circuit topologies and layouts that make efficient use of the back-gate bias to overcome the limitations associated with the low breakdown voltage of 20nm and 12nm FDSOI CMOS technologies. We will conclude with a review of a 44Gb/s 3D module using a > 4V swing 28nm FDSOI transmitter flip-chipped on a silicon photonics interposer with an integrated Mach-Zehnder modulator.

**Sorin P. Voinigescu** holds the Stanley Ho Chair in Microelectronics and is the Director of the VLSI Research Group in the Electrical and Computer Engineering Department at the University of Toronto. He is an IEEE Fellow and a world-renowned expert on mm-wave and 100+Gb/s integrated circuits and atomic-scale semiconductor device technologies. Between 1994 and 2002 he was first with Nortel Networks and later with Quake Technologies in Ottawa, Canada. In 2008-2009 and 2015-2016, he spent sabbatical leaves at Fujitsu Laboratories of America, Sunnyvale, California, at NTT's Device Research Laboratories in Atsugi, Japan, and at Robert Bosch GmbH in Germany, exploring technologies and circuits for 128GBaud fiber-optic systems, 300Gb/s mm-wave radio transceivers, and radar sensors.

Dr. Voinigescu co-founded and was the CTO of two fabless semiconductor start-ups: Quake Technologies and Peraso Technologies. He was a member of the ITRS RF/AMS Committee and of the ExCom of IEEE CSICS, and is a member of the ExCom of the IEEE BCTM. He received NORTEL's President Award for Innovation in 1996 and is a co-recipient of the Best Paper Award at the 2001 IEEE CICC and the 2005 IEEE CSICS, and of the Beatrice Winner Award at the 2008 IEEE ISSCC. His students have won numerous Best Student Paper awards, most recently at IEEE IMS 2017. In 2013 he was recognized with the ITAC Lifetime Career Award for his contributions to the Canadian Semiconductor Industry.

## F3: Circuits and Architectures for Wireless Sensing, Radar and Imaging

**Organizer:** **Brian Ginsburg**, Texas Instruments, Dallas, TX  
**Committee:** **Pedram Lajevardi**, Robert Bosch LLC, Palo Alto, CA  
**Andrea Mazzanti**, Università di Pavia, Pavia, Italy  
**Hayato Wakabayashi**, Sony, San Jose, CA  
**Yuu Watanabe**, Waseda University, Kanagawa, Japan  
**Alan Wong**, EnSilica, Abingdon, United Kingdom

Remote sensing has become an increasingly important area of development in the last few years. Various kinds of signals are used: electromagnetic waves at RF and mm-wave frequencies, infrared and visible light, and acoustic waves. These sensors also require sophisticated signal conditioning and signal processing to extract relevant information from background clutter. This forum gives an overview of circuits, sensors and entire systems that are based on these technologies.

### Integrated Circuits for Next-Generation Miniature Ultrasound Probes

*Michiel Pertijs, Delft University of Technology, Delft, The Netherlands*

Medical ultrasound probes mounted at the tip of an endoscope or catheter are an important tool to diagnose cardiac conditions, and to guide catheter-based minimally-invasive interventions. Such mm-sized probes are currently capable of providing 2D cross-sectional images only. The capability to produce real-time 3D images would be very valuable, but calls for the integration of a 2D array of 1000+ transducer elements in the probe tip, a number far exceeding the number of cables that can be accommodated by the shaft of an endoscope or catheter. Therefore, in-probe integrated circuits are needed to locally reduce the number of channels. Several analog approaches have been reported, including multiplexing switches and sub-array beamformers. These circuits rely on analog links to the imaging system, where the received echo signals are digitized and processed. This talk explores the possibility of in-probe digitization, to improve signal quality and open up the possibility of applying in-probe digital compression and multiplexing techniques to significantly reduce cable count. Due to the stringent size and power-consumption constraints, this requires state-of-the-art application-specific ADCs. Several examples will be presented, including a beamforming SAR ADC and an element-matched delta-sigma ADC.

**Michiel A. P. Pertijs** received the M.Sc. and Ph.D. degrees in electrical engineering (both cum laude) from Delft University of Technology, Delft, The Netherlands, in 2000 and 2005, respectively. From 2005 to 2008, he was with National Semiconductor, Delft, where he designed precision operational amplifiers and instrumentation amplifiers. From 2008 to 2009, he was a Senior Researcher with imec / Holst Centre, Eindhoven, The Netherlands. In 2009, he joined the Electronic Instrumentation Laboratory of Delft University of Technology, where he is now an Associate Professor. He heads a research group focusing on integrated circuits for medical ultrasound and energy-efficient smart sensors. He has authored or co-authored two books, three book chapters, 12 patents, and over 80 technical papers.

Dr. Pertijs serves as an Associate Editor of the IEEE Journal of Solid-State Circuits (JSSC). He has also served on the program committees of the International Solid-State Circuits Conference (ISSCC), the European Solid-State Circuits Conference (ESSCIRC) and the IEEE Sensors Conference. He received the ISSCC 2005 Jack Kilby Award for Outstanding Student Paper and the JSSC 2005 Best Paper Award. For his Ph.D. research on high-accuracy CMOS smart temperature sensors, he received the 2006 Simon Stevin Gezel Award from the Dutch Technology Foundation STW. In 2014, he was elected Best Teacher of the EE program at Delft University of Technology.

### Emerging Electromagnetic-Acoustic Sensing and Imaging Beyond Radar and Ultrasound

*Yuanjin Zheng, Nanyang Technological University, Singapore, Singapore*

Traditional electromagnetic sensing techniques (e.g. Radar and Lidar) and acoustic imaging techniques (e.g. microphone and ultrasound) have wide applications in military, automotive, consumer, medical, and healthcare fields. The emerging Electromagnetic-Acoustic (EMA) technique combines the merits of electromagnetic sensing with acoustic imaging, and goes beyond to fuse the sensors. In this forum, we will discuss in depth the implementations, functions and limitations of the respective sensors from circuits to systems, and demonstrate their emerging applications. We present a thorough overview of the realization of three types of sensors: (1) low-power phased-array radar chips for Synthetic Aperture Radar (SAR) imaging, (2) photoacoustic sensors for blood oxygen and blood glucose sensing, and (3) EMA systems for non-destructive sensing and testing. Detailed circuits and architectures to implement the SAR sensor are presented. For photoacoustic sensors, the key modules of fibre-coupled pulsed lasers, beamforming ultrasound transducers, and low-power low-noise signal acquisition circuits are designed and implemented. Furthermore, there is increasing interest in adopting microwave-induced thermoacoustic and magnetoacoustic sensors, and we will briefly present our implementation of resonant-coil-based EMA sensors for non-contact sensing and imaging.

**Yuanjin Zheng** received his B.Eng. from Xian Jiaotong University, P. R. China in 1993 with first class honors, M.Eng. from Xian Jiaotong University, P. R. China in 1996 with the honor of the best graduate student thesis award, and Ph.D. from Nanyang Technological University, Singapore in 2002. From July 1996 to April 1998, he worked at the national key lab of optical communication technology, University of Electronic Science and Technology of China. He joined the Institute of Microelectronics, A\*STAR in 2001 and was promoted to group technical manager. He has led teams in developing various CMOS integrated circuits for wireless systems, such as Bluetooth, WLAN, WCDMA, UWB, wireless capsule imager, etc. In July 2009, he joined Nanyang Technological University, where he is now an Associate Professor, developing various sensors (Radar, Lidar, EM acoustics) and hybrid circuits and devices (GaN, SAW, MEMS). He has published over 260 international journal and conference papers (including 7 ISSCC papers) and 5 book chapters, with 22 patents filed/granted. He has been involved in organizing dozens of conferences as TPC chair and session chair.



### Systems and Algorithms For Millimeter-Resolution Imaging: From mm-Wave Radar to Multi-Physics RF-Ultrasound Approaches

*Amin Arbabian, Stanford University, Stanford, CA*

High-resolution RF to mm-wave imaging systems find applications in real-time radar imaging, situational awareness and navigational systems, as well as medical imaging and health-monitoring domains. This talk aims to explore a set of emerging imaging technologies and investigate both ‘top-down’ approaches that combine system design with new algorithms to overcome classical challenges, as well as new ‘bottom-up’ methodologies that use new multi-physics detection and sensing techniques to tackle limitations in contrast and resolution. On the first front we focus on mm-wave systems and investigate the transition between radar detection and full real-time imaging arrays. New joint systems-algorithms approaches to allow for scaling into larger high-resolution imaging arrays will be discussed. On the second front we will explore multi-physics approaches that combine RF and microwaves with ultrasonics to provide alternative tradeoffs in the contrast-resolution design space. Examples of new thermoacoustic systems, ranging from semiconductor-based high-resolution immersion medical imaging to full non-contact techniques for interrogation and internal mapping of dispersive media, will be discussed.

**Amin Arbabian** received his Ph.D. degree in EECS from UC Berkeley in 2011 and in 2012 joined Stanford University, as an Assistant Professor of Electrical Engineering, where he is also a School of Engineering Frederick E. Terman Fellow. His research interests are in mm-wave and high-frequency circuits and systems, imaging technologies, and ultra-low power sensors and implantable devices. Prof. Arbabian currently serves on the steering committee of RFIC, the technical program committees of RFIC and ESSCIRC, and as associate editor of the IEEE Solid-State Circuits Letters (SSC-L) and the IEEE Journal of Electromagnetics, RF and Microwaves in Medicine and Biology (J-ERM). He is the recipient or co-recipient of the 2016 Stanford University Tau Beta Pi Award for Excellence in Undergraduate Teaching, 2015 NSF CAREER award, 2014 DARPA Young Faculty Award (YFA) including the Director’s Fellowship in 2016, 2013 Hellman faculty scholarship, and best paper awards from several conferences including ISSCC (2010), VLSI Circuits (2014), RFIC symposium (2008 and 2011), ICUWB (2013), PIERS (2015), and the MTT-S BioWireless symposium (2016).



### Portable Continuous-Wave Radar for Non-Contact Sensing and Localization

*Changzhi Li, Texas Tech University, Lubbock, TX*

Wireless sensors with embedded control and communication links have the potential to improve the quality of service in healthcare, infrastructure maintenance, and energy conservation. This presentation provides an overview of our research activities on smart RF sensors aided with advanced technologies such as beamforming, inverse synthetic aperture radar, and flexible electronics. In a smart house, the sensors ensure human well-being and energy efficiency by tracking users’ vital signs, location, gait, gestures, and activities. In cancer radiotherapy, we investigate non-contact tumor tracking, which dynamically targets a tumor with a radiation beam when the tumor moves due to the respiratory movement of a patient. In structural health monitoring, our RF sensors advance infrastructure maintenance by remotely monitoring structural vibrations and movements, as aging infrastructure remains a national concern with widespread impacts on the quality of our daily lives.

**Changzhi Li** received the Ph.D. degree in electrical engineering from the University of Florida, Gainesville, FL, in 2009. He is an Associate Professor at Texas Tech University. His research interests include biomedical applications of microwave/RF, wireless sensors, and RF/analog circuits.



### Radar Circuits and Systems for Vital Signs Monitoring

*Jørgen Andreas Michaelsen, Novelda, Oslo, Norway*

There is a growing demand for smarter and less obtrusive devices that seamlessly interact with users and their surroundings. These devices rely on sensor data to infer contextual information, such as the presence, position, and movement trajectories of humans and objects in the scene. Non-contact sensing of vital signs such as breathing and heart rate allows for robust presence detection, even for sleeping subjects. Furthermore, vital signs data can be used for estimating sleep stages, emotional states, and health parameters, as well as safety-enhancing features, including drowsiness detection for drivers and machine operators.

While research into vital signs monitoring using radars goes back more than four decades, solutions have only recently become practical with advances in low-power integrated circuits that enable compact and inexpensive radar sensors. This presentation will focus on IR-UWB radar systems in comparison to radar technologies such as CW, FMCW, and SFCW at different bands of operation. In particular, circuit complexity and performance trade-offs, interference sources and mitigation, and regulatory topics will be emphasized.

**Jørgen Andreas Michaelsen** received his M.Sc. degree and Ph.D. degree from the University of Oslo in 2006 and 2014, respectively. Since 2013, he has been with the R&D department at Novelda, Norway, as an IC design engineer, working on high-speed analog and mixed-signal design, focusing on high-speed data converters for radar systems.



### Challenges and Opportunities in Automotive Radar Systems

*Karthik Ramasubramanian, Texas Instruments, Bangalore, India*

The talk will start with a brief introduction to FMCW radar and discuss the present status of automotive radar and the recent industry trends in this space. Radar is an important component in the industry's quest towards fully autonomous vehicles. Given the rapid progress being made towards autonomous vehicles and the importance of radar to enable the ultimate vision of Level-5 autonomy, addressing some of the critical challenges faced by radar technology is of high interest. In this context, this talk will highlight the key system-level challenges such as limited angular resolution in azimuth and elevation, the issue of interference across many radars, object identification using radar, etc. and describe the ongoing developments to address these challenges both from a silicon and systems perspective.

**Karthik Ramasubramanian** is a Distinguished Member of Technical Staff with Texas Instruments India Pvt. Ltd. and the manager of the Radar Systems team in Bangalore. He received the Bachelor of Technology degree in Electrical Engineering from the Indian Institute of Technology, Madras in 1997 and the M.S. degree in Electrical Engineering from The Ohio State University, Columbus in 1999. He has 18 years of experience in industry working on multiple technologies including radar, GPS and wireless LAN. His primary interest is in the area of signal processing and communications. Karthik has been involved with automotive radar technology since 2013 and has made key contributions to the system and algorithm design for TI's 77GHz radar devices. Prior to that, Karthik worked on GPS receiver design for over a decade and contributed to multiple generations of successful GPS products at TI.



### What's the Best Technology and Architecture for your Time of Flight System?

*David Stoppa, ams AG, Rueschlikon, Switzerland*

The increasing demand for environmental awareness in many IoT, consumer and automotive applications has dramatically boosted the development of a new generation of time-of-flight systems and sensors in the past few years. Several competing technologies and measuring techniques are now available, and it is becoming more and more challenging to identify the optimal solution fitting the application requirements in terms of distance precision, spatial resolution, power consumption, module size, latency, background immunity, etc.

The main goal of this talk is to provide an in-depth overview of state-of-the-art ToF detector technologies, focusing on the main advantages and disadvantages of the two key competing detector classes, i.e. photo-demodulators and SPADs, and their natural implementation in indirect-/direct-ToF systems through key examples from commercial and academic implementations.

**David Stoppa** (SM'12-M'97) received the Laurea degree in Electronics Engineering from Politecnico of Milan, Italy, in 1998, and the Ph.D. degree in Microelectronics from the University of Trento, Italy, in 2002. In 2017 he joined AMS, where he is in charge of the research and development of next-generation range-sensors. From 2014 to 2017 he was the head of the Integrated Radiation and Image Sensors research unit at FBK, where he had worked as a research scientist since 2002 and as group leader of the Smart Optical Sensors and Interfaces group from 2010 to 2013. From 2002 to 2012 he taught courses on Analogue Electronics and Microelectronics at the Telecommunications Engineering faculty of the University of Trento. His research interests are mainly in the field of CMOS integrated circuits design, image sensors and biosensors. He has authored or co-authored more than 120 papers in international journals and presentations at international conferences, and holds several patents in the field of image sensors. Since 2011 he has served as a program committee member of the International Solid-State Circuits Conference (ISSCC) and the SPIE Videometrics, Range Imaging and Applications conference, and was a technical committee member of the International Image Sensors Workshop (IISW) in 2009, 2013, 2015 and 2017. He was a Guest Editor for the IEEE Journal of Solid-State Circuits special issue on ISSCC'14 in 2015 and has been serving as Associate Editor since 2017. Dr. Stoppa received the 2006 European Solid-State Circuits Conference Best Paper Award.



### Optical Phased Array LiDAR

*Michael Watts, Analog Photonics, Boston, MA*

We discuss the current challenges, opportunities, and advantages for chip-scale optical phased array based LiDAR. Starting from the LiDAR equation, we review the noise performance of phased array LiDAR with direct versus coherent detection in light of automotive requirements. The limits of current silicon photonic device performance and considerations of electrical drive and control of large-scale optical phased arrays will be discussed. Finally, recent results on coherent silicon optical phased array based LiDAR chips will be presented.

**Michael R. Watts** is an Associate Professor in the Department of Electrical Engineering and Computer Science (EECS) at the Massachusetts Institute of Technology. Mike is currently on a teaching leave from MIT to serve as CEO of Analog Photonics, where he is developing chip-scale LiDAR and a silicon photonics Process Design Kit (PDK). Mike also serves as CTO of AIM Photonics, the \$600M public-private partnership to advance the state of US Manufacturing in Silicon Photonics. Mike joined the Massachusetts Institute of Technology (MIT) faculty in 2010, where his research interests include new Silicon Photonic applications including precision timing, frequency synthesis, and 3D displays. Prior to MIT, Mike was a Principal Member of Technical Staff at Sandia National Labs, where he led their Silicon Photonic development effort from 2005 to 2010, focussing on ultralow power communications and sensing applications. Prior to Sandia, Mike was at MIT, where he earned both his S.M. and Ph.D. degrees in Electrical Engineering with theses in Silicon Photonics, developing the first polarization independent microphotonic circuit. From 1996 to 1999, Mike was a Member of Technical Staff at Draper's Fiber Optics Group, and in 1996 Mike earned his BSEE from Tufts University.

## F4: Circuit and System Techniques for mm-Wave Multi-Antenna Systems



**Organizer:** **Pierre Busson**, ST Microelectronics, Crolles, France  
**Committee:** **Howard Luong**, Hong Kong University of Science and Technology, Hong Kong, China  
**Chih-Ming Hung**, MediaTek, Hsin-Chu, Taiwan  
**Harish Krishnaswamy**, Columbia University, New York, NY  
**Theodore Georgantas**, Broadcom, Athens, Greece  
**Patrick P. Mercier**, University of California San Diego, La Jolla, CA

The 5th generation wireless system (5G) is proposed as the next major revolution of mobile wireless technologies. Carrier frequencies in the mm-wave bands and MIMO/multi-antenna systems are expected to be extensively employed to achieve significantly enhanced data rate, spectral/spatial diversity/efficiency and minimized system latency. The design of commercial high-performance radio transceivers at mm-wave represents a major technical challenge. This forum is focused on current state-of-the-art and future directions of multi-antenna systems in the mm-wave bands, from both system architecture and circuit design perspectives. Key system integration aspects such as antenna design, packaging and built-in self-test will also be covered.



### Broadband Architectures and Multiport Antennas Co-Design for Frequency, Pattern and Spatial Diversity in mm-Wave MIMO arrays

*Kaushik Sengupta, Princeton University, Princeton, NJ*

Future communication infrastructure and networks are expected to operate over several disjointed frequency bands in the mm-wave frequency range, opening up a spectrum, which is orders of magnitude larger than we ever had access to. Efficient use of the available spectrum is key towards enabling extremely heterogeneous network infrastructure for future applications. Evidently, scalability of arrays to address the disjointed bands spreading across 28 to 76GHz and beyond becomes a critical issue, particularly for the mobile terminals. In this talk, we will present our approaches towards asymmetrical Tx architectures and on-chip multiport antenna combining to enable frequency and field programmability in an efficient fashion that loosens the classical trade-offs between output power, efficiency, spectral efficiency (linearity), and spectral reconfigurability. We will also discuss resultant system-level enhancements that come with such programmability.

**Kaushik Sengupta** received the B.Tech. and M.Tech. degrees in electronics and electrical communication engineering from the Indian Institute of Technology (IIT), Kharagpur, India, both in 2007, and the M.S. and Ph.D. degrees in electrical engineering from the California Institute of Technology, Pasadena, CA, USA, in 2008 and 2012, respectively. In 2013, he joined the Faculty of the Department of Electrical Engineering, Princeton University, Princeton, NJ, USA. His current research interests include high-frequency ICs, electromagnetics, and optics for various applications in sensing, imaging, and high-speed communication. Dr. Sengupta received the Young Investigator Program (YIP) Award from the Office of Naval Research in 2017, and the Charles Wilts Prize in 2013 from the Department of Electrical Engineering, Caltech, for outstanding independent research in electrical engineering leading to a Ph.D. He was thrice selected to the Princeton Engineering Commendation List for Outstanding Teaching in 2014, 2016 and 2017. He serves on the Technical Program Committee of the IEEE ESSCIRC, IEEE CICC and PIERS. He is an IEEE senior member and a co-recipient of the IEEE RFIC Symposium Best Student Paper Award (1st prize) in 2012 and the 2015 IEEE MTT-S Microwave Prize.



### Multi-Beam Phased Arrays for 5G Systems

*Kazuaki Kunihiro, NEC, Kawasaki, Japan*

In 5G systems, beamforming for increasing spectrum efficiency and millimeter-wave communication with wider bandwidth are key technologies to enhance the network capacity. In this talk, we will describe the design and performance of two kinds of digital AAS (active antenna system) prototypes. First, we will introduce a sub-6 GHz full-digital massive-MIMO system including 64 antenna elements. The field trial for multiple-user downlink in an indoor environment with this prototype will be presented. Next, we will show a 28-GHz, 480-element digital beam-forming prototype with an EIRP of 68 dBm, which is able to cover a macro-cell area. In this prototype, we adopted a direct IF architecture consisting of a novel FPGA-based bit-streamer to reduce the power consumption. The OTA measurements for the developed 28-GHz AAS prototype will be presented.

**Kazuaki Kunihiro** received the B.S. and M.S. degrees from the Tokyo Institute of Technology, Tokyo, Japan, in 1988 and 1990, respectively, and the D.E. degree from Nagoya University, Nagoya, Japan, in 2004. In 1990, he joined the NEC Corporation, Kawasaki, Japan, where he has been engaged in device simulation, modeling, and MMIC design of GaAs FETs, GaAs HBTs, and GaN FETs for wireless communications. His current interests include high-efficiency transmitter architecture/IC such as digital transmitters and millimeter-wave/sub-THz transmission systems for 4G/5G mobile base stations and backhaul. He is currently a senior principal researcher in System Platform Research Laboratories of NEC Corporation.



### Phased Arrays and 5G: The End of the Marconi Era is Near

*Gabriel Rebeiz, UCSD, San Diego, CA*

The use of directive communications, using phased arrays, MIMO, or both, is increasing the data rate by 10 to 30$\times$ and resulting in a new communication revolution. At the heart of this transformation are affordable phased arrays based on a hybrid beamforming architecture with RF phase-shifting at the element level and digital beamforming at the sub-array level. The talk will present the recent advances in this area, from architectures to the recent silicon chips. Also, entire 5G phased-array systems with Gb/s links at hundreds of meters will be presented. The goal is to end the Marconi-Era model of broadcast communications (wide coverage base station to low-gain antennas) and increase the network capacity by 10 to 100$\times$.

**Gabriel Rebeiz** is Distinguished Professor, the Wireless Communications Industry Endowed Chair at UCSD, and Member of the National Academy (elected for phased arrays). He is an IEEE Fellow and received the IEEE Daniel Noble Award for his work on RF MEMS, the NSF Presidential Young Investigator, MTT Microwave Prize (twice for phased-array topics), MTT Distinguished Educator, IEEE Antennas and Propagation John D. Kraus Award for the dielectric lens antenna, and the Harold Wheeler Award for multimode phased-array antennas. He also received the Amoco Teaching Award given to the best undergraduate teacher at the University of Michigan and the Jacobs ECE Teacher of the Year at UCSD. He is considered one of the fathers of RF MEMS and tunable networks, low-cost silicon RFIC phased arrays, and mm-wave and THz antennas. Prof. Rebeiz has graduated 92 PhD students and post-docs, has written 700 IEEE publications, and has been referenced over 27000 times with an h-index of 75 (the highest in the world for RF/microwaves). He is an advisor to several of the large commercial and defense companies in the US.



### CMOS PA Design at mm-Wave Frequencies

*Patrick Reynaert, KU Leuven, Leuven, Belgium*

As the mm-wave frequency range gains importance, the design of power amplifiers (PAs) at these frequencies is becoming crucial. CMOS is the technology of choice for high-volume communication products. MIMO and beamforming are the enablers in meeting the required link-budget and throughput. All this puts more pressure on the required specifications of the PA, such as linearity, efficiency, output power, footprint, and more.

This presentation will discuss the challenges of designing a CMOS PA at mm-wave frequencies. Transistor layout optimization towards gain, stability, and output power will be discussed in detail. Power combining and device stacking will also be discussed as these are crucial to achieving sufficient output power from a low-voltage technology. At the architecture level, many options lie ahead, such as polar, envelope tracking, outphasing, and Doherty architectures. These will be reviewed in detail and their pros and cons for mm-wave operation will be presented. Many examples will be presented in 28nm and 40nm CMOS, covering a frequency range from 28 to 85GHz.

**Patrick Reynaert** received the Master of Electrical Engineering (ir.) and the Ph.D. in Engineering Science (dr.) from the University of Leuven, KU Leuven, Belgium in 2001 and 2006, respectively. During 2006-2007, he was a post-doctoral researcher at UC Berkeley. During the summer of 2007, he was a visiting researcher at Infineon, Villach, Austria. Since October 2007, he has been a Professor at the University of Leuven, KU Leuven, department of Electrical Engineering (ESAT-MICAS). His main research interests include mm-wave and THz CMOS circuit design, high-speed circuits, and RF power amplifiers. Patrick Reynaert is a Senior Member of the IEEE and the chair of the IEEE SSCS Benelux Chapter. He serves or has served on the technical program committees of several international conferences including ISSCC, ESSCIRC, RFIC, PRIME, and IEDM. He has served as Associate Editor for Transactions on Circuits and Systems – I, and as Guest Editor for the Journal of Solid-State Circuits. He received the 2011 TSMC-Europractice Innovation Award, the ESSCIRC-2011 Best Paper award, and the 2014 2<sup>nd</sup> Bell Labs Prize.



### Advances in mm-Wave Phased Arrays for Beamforming in 5G Systems

*Alberto Valdes-Garcia, IBM T. J. Watson Research Center, Yorktown Heights, NY*

This talk presents recent advances in architectures, Si-based circuits, and antenna-in-package designs that enable precise and agile beamforming at mm-wave frequencies without the need for calibration. Specifically, techniques for RF beamforming with orthogonal amplitude and phase control at each front-end element are described, and a phased-array antenna-in-package concept with a uniform antenna-array cavity is presented. To motivate the need for these advances, prior implementations of mm-wave beamforming blocks and antenna arrays are presented along with the calibration and performance limitations associated with them. A complete 28GHz phased-array antenna module (PAAM), recently co-developed by IBM Research and Ericsson for 5G applications, is presented as an implementation example of these advances. The PAAM consists of 64 dual-polarized antennas driven by 128 independent RF phase-shifting transceiver front-ends and supports two simultaneous and independent 64-element beams in either TX or RX modes. Comprehensive beamforming measurements are discussed, including ±50-degree beam scanning, <1.5-degree beam-steering resolution, and tapering.
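As a rough illustration of the per-element phase control mentioned above, the following sketch computes the progressive phase shifts that steer a uniform linear array and checks the resulting array factor. It is a textbook idealization, not the PAAM's actual design: the 8-element row, the half-wavelength spacing, the 30-degree steering angle, and the function names `steering_phases_deg` and `array_factor_db` are all assumptions chosen for the example.

```python
import math

def steering_phases_deg(num_elements, spacing_wavelengths, steer_angle_deg):
    """Per-element phase shifts in degrees, wrapped to [0, 360), for a uniform linear array."""
    # Progressive phase step between neighbours: -360 * (d / lambda) * sin(theta).
    step = -360.0 * spacing_wavelengths * math.sin(math.radians(steer_angle_deg))
    return [(n * step) % 360.0 for n in range(num_elements)]

def array_factor_db(phases_deg, spacing_wavelengths, look_angle_deg):
    """Normalized array-factor magnitude in dB toward look_angle_deg (isotropic elements)."""
    # Path-length phase difference between neighbouring elements for this look angle.
    psi = 2.0 * math.pi * spacing_wavelengths * math.sin(math.radians(look_angle_deg))
    re = sum(math.cos(n * psi + math.radians(p)) for n, p in enumerate(phases_deg))
    im = sum(math.sin(n * psi + math.radians(p)) for n, p in enumerate(phases_deg))
    mag = math.hypot(re, im) / len(phases_deg)
    # Floor the magnitude so that exact nulls do not produce log10(0).
    return 20.0 * math.log10(max(mag, 1e-12))

# Hypothetical example: 8 elements, half-wavelength spacing, beam steered to 30 degrees.
phases = steering_phases_deg(num_elements=8, spacing_wavelengths=0.5, steer_angle_deg=30.0)
print("phase settings (deg):", [round(p, 1) for p in phases])
print("gain toward 30 deg (dB):", round(array_factor_db(phases, 0.5, 30.0), 3))
```

In this idealized model the array factor peaks in the steered direction; a real module must also account for element patterns, mutual coupling, and phase-shifter quantization, which the sketch ignores.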

**Alberto Valdes-Garcia** is currently a Research Staff Member and Manager of the RF Circuits and Systems Group at the IBM T. J. Watson Research Center. He received the Ph.D. degree in Electrical Engineering from Texas A&M University in 2006. From 2006 to 2009, Dr. Valdes-Garcia served in the IEEE 802.15.3c 60GHz standardization committee. Since 2009, he has served as a Technical Advisory Board member with the Semiconductor Research Corporation (SRC), where he was the chair of the Integrated Circuits and Systems Sciences Coordinating Committee in 2011 and 2012. In spring 2013, he was also an Adjunct Assistant Professor at Columbia University. He holds 35 issued US patents with 30+ pending. His scholarly work (100+ authored or co-authored publications) has received more than 3000 independent citations. He is a co-Editor of the book "60GHz Technology for Gbps WLAN and WPAN: From Theory to Practice," Wiley, 2011. Dr. Valdes-Garcia is the winner of the 2005 Best Doctoral Thesis Award presented by the IEEE Test Technology Technical Council (TTTC), the recipient of the 2007 National Youth Award for Outstanding Academic Achievements, presented by the President of Mexico, and a co-recipient of the 2010 George Smith Award presented by the IEEE Electron Devices Society. In 2013, he was selected by the National Academy of Engineering for its Frontiers of Engineering Symposium and in 2015 for its German-American Frontiers of Engineering Symposium. Within IBM, he is a co-recipient of an IBM Corporate Outstanding Innovation Award for the demonstration of wireless high-definition video links with 60GHz SiGe radios (2008), and the 2009 Pat Goldberg Memorial Award for the best paper in computer science, electrical engineering, and mathematics published by IBM Research for the work "Operation of Graphene Transistors at GHz Frequencies," Nano Letters, 2009.



### Power Amplifiers in Advanced Antenna Systems

*Ulf Gustavsson, Ericsson, Göteborg, Sweden*

Advanced antenna arrays are an important technology component for bringing 5G into the world. The technologies range from massive MU-MIMO for spectrum below 6GHz to analog/hybrid beamforming in higher frequency bands. With these technologies, new challenges arise in terms of circuit design and radio-signal processing related to efficient power amplification. In this talk, we will present some recent advances in the modeling and analysis of power-amplifier distortion in large antenna systems. The talk will be delivered in two parts. The first part will focus both on modeling power amplifiers under the influence of mutual coupling and on mitigation of these effects with advanced pre-distortion techniques. The second part will discuss the spatial distribution of power-amplifier distortion analyzed using second-order statistics through the Bussgang theorem. We will briefly discuss how this stochastic modeling technique may bring elements of circuit design into communication systems engineering.

**Ulf Gustavsson** received the M.Sc. degree in electrical engineering from Örebro University, Örebro, Sweden, in 2006, and the Ph.D. degree from the Chalmers University of Technology, Gothenburg, Sweden, in 2011. He is currently a Senior Specialist with Ericsson Research where his research interests include radio-signal processing techniques for hardware impairment mitigation and behavioral modeling of radio hardware for future advanced antenna systems. Dr. Gustavsson is currently also the lead scientist from Ericsson Research in the Marie Skłodowska-Curie European Industrial Doctorate Innovative Training Network, SILIKA (<http://silika-project.eu/>).



### Integration of mm-Wave Antennas Using Organic Packaging Technologies up to 240GHz

*Cyril Luxey, University of Nice, Valbonne, France*

Developing high-data-rate wireless networks is of paramount importance in meeting the growing demand for mobile services. With the upcoming transition to a 5G standard, large BWs are now required to provide data rates higher than 10Gb/s. Sub-THz frequencies are widely considered, since BWs of several tens of GHz are easily accessible. Experimental 100Gb/s wireless links using a III-V photonic technology have been demonstrated above 200GHz. Indeed, photonic transmitters feature higher BWs than solid-state transmitters. However, broadband, efficient, integrated antenna-in-package solutions are needed either as stand-alone antennas for short communication links or as antenna-sources of larger quasi-optical radiators for long-distance communications (>100m) to operate in conjunction with solid-state or photonic transmitters.

This presentation will discuss antenna performance already obtained using organic packaging technologies and 3D-printed quasi-optical antenna solutions from 60 to 240GHz.

**Cyril Luxey** received the Ph.D. degree with honours in electrical engineering from the University Nice-Sophia Antipolis (UNS), France in 1999. From 2000 to 2002, he was with Alcatel, Mobile Phone Division, France, where he was involved in the design and integration of internal antennas for commercial mobile phones. Since 2009, he has been a Full Professor at UNS. His current research interests include the design and measurement of mm-wave antennas, antennas-in-package, plastic lenses, and organic modules for mm-wave and sub-mm-wave frequency bands. Cyril Luxey is an IEEE Fellow. In October 2010, he was appointed as a Junior Member of the Institut Universitaire de France (IUF). He was an associate editor for IEEE Antennas and Wireless Propagation Letters from May 2012 to May 2017. Cyril Luxey and his students received the H.W. Wheeler Award of the IEEE Antennas and Propagation Society for the best application paper of the year, 2006. He is also the co-recipient of the Jack Kilby Award 2013 of the ISSCC conference and of several best-paper awards at EuCAP 2007, iWAT 2009, LAPC 2012, LAPC 2013, ICEAA 2014, and iWEM 2014. Cyril Luxey is the recipient of the University Nice-Sophia Antipolis Medal (2014) and the recipient of the University Côte d'Azur medal (2016). Cyril Luxey has authored or co-authored more than 300 papers in refereed journals, in international and national conferences, and as book chapters. Cyril Luxey was the general chair of the Loughborough Antennas and Propagation Conference 2011, the award and grant chair of EuCAP 2012, the invited paper co-chair of EuCAP 2013, and the TPC chair of the EuCAP 2017 conference in Paris. Since 2015, he has been a member of the IEEE AP-S Education committee.



### New Wave SiP for mm-Wave

*CP Hung, Advanced Semiconductor Engineering Group, Kaohsiung, Taiwan*

The millimeter-wave (mm-wave) band from 28 to 300GHz has been identified as an important technology to provide extreme throughput and low latency for applications such as fifth-generation (5G) mobile communications (28/39GHz band), 802.11ad (60GHz band), and automotive radar (77/79GHz band). System in Package (SiP) provides users of the above applications with comprehensive solutions to optimize and differentiate their products to meet system requirements. This talk will review innovations in SiP technologies, such as Fan-out, Hybrid Passives (HyPas), Through Silicon/Glass/Mold Via (TSV/TGV/TMV), etc., and will describe how these solutions are verified to achieve higher bandwidth connectivity, smaller form factor, increased functionality, and mixed nodes, thereby making them very important in mobile and big data applications.

**CP Hung** is currently the VP of Corporate R&D, ASE Group (Advanced Semiconductor Engineering, Inc.), leading teams for next-generation product development with integrated technologies and enabling holistic chip, package, and system integration solutions. He has held several management positions at ASE, including VP of Corporate Design, VP of Central Engineering/Business Development, and VP of Logistic Service Integration in ASE Kaohsiung. He holds 50 patents on IC packaging structure, process, substrate, and characterization technology, and he has published over 32 conference/journal papers.



### Non-Invasive Calibration and Built-In Self Test (BIST) for Phased-Array Systems

*José Luis González, CEA-LETI-MINATEC, Grenoble, France*

In recent years, an increasing number of wireless links use phased-array antennas in order to meet the range requirements while at the same time providing the capability to steer the antenna beam to track users or to automatically align links. This poses an important challenge for testing, since such systems integrate the antennas and the transceivers in compact modules, which complicates access to internal nodes. For example, 5G systems are following this direction, with envisaged carrier frequencies beyond 24GHz. Built-in testing and built-in self-testing (BIST) for RFICs is an attractive solution in this context that can also be used for calibration and self-trimming. However, with increasing carrier frequencies, even internal access to the high-frequency outputs of the circuits, especially at mm-wave frequencies, becomes very challenging if the impact on circuit performance is to be kept to a minimum. Non-invasive, contactless techniques using indirect measurements, such as the local temperature increase, have recently been proposed to tackle this issue. Other recent BIST and calibration techniques are based on replica circuits. In this talk, we will introduce these innovative non-invasive testing and calibration techniques for mm-wave integrated front-ends.

**José Luis González** is currently a Senior Expert, deputy laboratory head, and RFIC and mmW Research Engineer at CEA/LETI, Grenoble, France, and an invited lecturer at Helma Engineering School, Grenoble-Alpes University. Until 2011, he was a Full-Time Associate Professor at the Department of Electronic Engineering, Univ. Politecnica of Catalonia (UPC). He regularly serves as a reviewer for journals such as IEEE Transactions on Circuits and Systems (I & II), IEEE Microwave and Wireless Components Letters, IEE Electronics Letters, and IEEE Transactions on Microwave Theory and Techniques, among others, and serves regularly on the technical program committees of several international conferences, including ESSCIRC from 2011 to 2016. He is the author of two books, a book chapter, 31 international journal papers, and more than 70 conference papers. He holds 13 patents. His research interests include very-large-scale integration design and test, mixed-signal/RF and mm-wave ICs, silicon photonics, and signal and power integrity in SoCs and RFICs.

## **F5: Advanced Optical Communication: From Devices, Circuits, and Architectures to Algorithms**



**Organizer:** **Bo Zhang**, *Broadcom, Irvine, CA*

**Committee:** **Frederic Gianesello**, *STMicroelectronics, Crolles, France*

**Simone Erba**, *STMicroelectronics, Pavia, Italy*

**Mounir Meghelli**, *IBM Thomas J Watson Research Center, Yorktown Heights, NY*

**Azita Emami**, *California Institute of Technology, Pasadena, CA*

**Takayuki Shibasaki**, *Fujitsu Laboratories, Kawasaki, Japan*

Since the invention of optical fiber in the 1970s, optical communication has been changing the landscape of telecommunication and data communication worldwide with its ultra-broad bandwidth and long-haul transmission capabilities. It connects people around the world through submarine intercontinental optical cables, is the backbone of metro area networks, and is essential for data center network connectivity. Today, cost-effective 100Gb/s optical links on a single fiber, using either III-V based optical devices or silicon photonics, are readily available for connectivity over a few meters to a few kilometers inside the data center and between data centers, while next-generation links are poised to reach 400Gb/s. In this forum, the current state-of-the-art of optical communications will be reviewed, including advances in long-haul transport, progress in silicon photonics covering transceivers, packaging, assembly and test, progress in high-order modulation schemes and signal processing, a description of 56Gb/s and beyond electrical serial interfaces, and closing with a presentation on optical backplane technology.



### **Scalable Optical Transport Network with Capacity Over One Petabit per Second**

*Yutaka Miyamoto, NTT, Yokosuka, Japan*

This talk describes the past, present, and future of high-capacity optical transport technology to support the scalable evolution of broadband services. Future space division multiplexing (SDM)-based optical networks are promising to overcome the physical limits of today's single-mode fibers. They will achieve high capacities of over 1Pb/s in a single strand of fiber, a 100-fold increase in capacity, and node throughputs of over 10Pb/s. For the installation of such SDM and WDM transport systems in limited floor space, it is indispensable to enhance the transmission performance, compactness, and energy efficiency of the optical transceiver. Low-power massive integration technologies of both electronic circuits and photonic circuits, such as silicon photonics, are needed as key enablers. Novel analog electrical and optical preprocessing is described for reducing the digital signal processing complexity and enhancing the bandwidth of CMOS electronics.

**Yutaka Miyamoto** received the B.E. and M.E. degrees in electrical engineering from Waseda University, Tokyo, Japan, in 1986 and 1988, respectively. He joined NTT Transmission Systems Laboratories, Yokosuka, Japan, in 1988, where he engaged in research and development on high-speed optical communications systems including the first 10Gb/s terrestrial optical transmission system (FA-10G) using EDFA inline repeaters. He then joined NTT Electronics Technology Corporation between 1995 and 1997, where he engaged in the planning and product development of high-speed optical modules at the data rate of 10Gb/s and beyond. Since 1997, he has been with NTT Network Innovation Labs, where he has contributed to the research and development of optical transport technologies based on 40/100/400Gb/s channels and beyond. He is now a senior distinguished researcher at NTT Laboratories and director of the Innovative Photonic Network Research Center of NTT Network Innovation Laboratories, where he has been investigating and promoting the future scalable Optical Transport Network with Pb/s-class capacity based on innovative transport technologies such as digital signal processing, space division multiplexing, and cutting-edge integrated devices for photonic preprocessing. He received the Dr. Eng. degree in electrical engineering from Tokyo University. He currently serves as Chair of the IEICE technical committee of Extremely Advanced Optical Transmission (EXAT). He is a member of the IEEE, and a Fellow of IEICE.



### **Insights Into Silicon Photonics Electro-Optical Transceiver Front-Ends**

*Enrico Temporiti, STMicroelectronics, Pavia, Italy*

Silicon photonics technology enables cost reduction and miniaturization of electro-optical interfaces. In this talk we will review the main applications driving the industrialization of silicon photonics and we will analyze the manufacturing of a silicon photonics platform in 300mm wafers, from a user perspective. The challenges entailed by the applications will be addressed and recently realized ICs achieving state-of-the-art performance will be presented.

**Enrico Temporiti** received the Laurea degree in Electronic Engineering in 1999. In 2000, he joined STMicroelectronics, where he is currently working as design manager within the CMOS ASIC R&D Team. His interests are in the field of CMOS analog and mixed-signal high-speed integrated circuits for wireless and wireline communications.



### New Paradigm Shift to PAM4 Signaling at 100/400G for Cloud Data Centers

*Frank Chang, Inphi, Thousand Oaks, CA*

Internet traffic has seen exponential growth driven by BW-hungry applications such as mobile, HDTV, IoT, social networking and cloud computing. In 2017, >90% of the global/IP WAN traffic will pass through data centers. Thus, special solutions need to be developed for high-speed connectivity inside and between data centers.

High-speed signaling using NRZ has approached speed limits above 50Gb/s, where it is extremely difficult to maintain power and spectral efficiency as well as performance over a variety of channels and applications. PAM4 is emerging as a key technology to enable an upgrade in spectral efficiency and address cost constraints in optical systems by packing more bits per wavelength. As a result, the last several years have witnessed the introduction and commercialization of PAM4 signaling to replace NRZ for today's cloud data centers.

In this invited presentation, we will review the applications/performance of PAM4 signaling with real-time processing at 100G & 400G, with emphasis on distance objectives from 100m MMF, up to 10/40km SMF and even 80-100km OSNR-limited connectivity. Going forward, we will address the strong momentum in migrating 400GbE transceiver design from 8 channels of 50Gb/s to 4 channels of 100Gb/s for short-reach fiber links. We will also discuss the battle of direct detection vs. coherent in building 400GbE connectivity between data centers.
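To make the "more bits per wavelength" point concrete, the sketch below maps bit pairs onto four amplitude levels and compares symbol rates for NRZ and PAM4. It is only an illustration under stated assumptions: the Gray-coded mapping, the normalized levels of ±1 and ±3, and the helper names `pam4_encode` and `symbol_rate_gbaud` are not taken from the talk, and the rates ignore FEC and line-coding overhead.

```python
import math

# Assumed Gray-coded PAM4 mapping: adjacent amplitude levels differ in exactly one bit.
PAM4_LEVELS = {(0, 0): -3, (0, 1): -1, (1, 1): 1, (1, 0): 3}

def pam4_encode(bits):
    """Map an even-length bit sequence to PAM4 amplitude levels (2 bits per symbol)."""
    if len(bits) % 2:
        raise ValueError("PAM4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]

def symbol_rate_gbaud(bit_rate_gbps, num_levels):
    """Symbol rate for a given raw bit rate, ignoring coding and FEC overhead."""
    return bit_rate_gbps / math.log2(num_levels)

# Eight bits become four PAM4 symbols instead of eight NRZ symbols.
symbols = pam4_encode([0, 0, 0, 1, 1, 1, 1, 0])
print("PAM4 symbols:", symbols)

# A 100Gb/s PAM4 lane runs at the same symbol rate as a 50Gb/s NRZ lane.
print("100Gb/s PAM4:", symbol_rate_gbaud(100, 4), "GBd")
print("50Gb/s NRZ:  ", symbol_rate_gbaud(50, 2), "GBd")
```

Under these simplifications, both the 8×50Gb/s and 4×100Gb/s configurations carry 400Gb/s in aggregate; the difference lies in the number of lanes and the per-lane symbol rate.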

**Frank Chang** is an expert in photonic IC technologies and optical networks in research, development, and technology innovation. He has been employed at Inphi's CTO Optics Office for Optics Interconnect since April 2013, after 11 years of service at Vitesse Semiconductors. He leads optical system engineering efforts for physical layer IC products involving high-speed drivers, TIAs and PAM4 PHYs for various 100/400G optical applications. He has over 20 years of working experience in the optical networking and communication IC industry. Prior to Vitesse, he held various senior project, architectural and management positions at Cisco/Pirelli, Maha Networks, and JDS Uniphase (now Lumentum & Viavi Solutions).

He is knowledgeable in signal integrity, EDC (electronic dispersion compensation), PMD, PHY, FEC, DSP and PAM4 chipsets and their application in optical networking. He has authored or co-authored over 90 peer-reviewed journal and conference articles and 4 book chapters, and has given numerous invited talks in the field. His OFC'16 paper on PAM4 was ranked among the top 20+ papers by Gazettabyte. He frequently represents Inphi (and previously Vitesse) in contributing to standard-setting bodies including IEEE 802.3bs, 802.3cd, 802.3bm, 802.3ba, 802.3av, OIF/ITU and FSAN/ITU Q2 for the definition of various optical interface specifications.

Dr. Chang has a core technical background in optics; he obtained his Ph.D. in Optoelectronics from the Ecole Polytechnique, University of Montreal, Canada for his research thesis on ultrashort optical pulse generation of 1550nm tunable solid-state lasers. He is an OSA Fellow and Sr. Member IEEE/LEOS. He is currently the OFC Committee chair (2018), the Industry Forum & Exhibit (IF&E) Co-chair for Globecom 2019, and among the Board of Directors and advisors for the Photonics Society of Chinese-Americans (PSC-SC).



### Mixed-Signal Electrical Transceivers for 56Gb/s and Beyond

*Elad Alon, University of California at Berkeley, Berkeley, CA*

As long as computing is done with electrical signals, all optical links will contain electrical links embedded within them. With current technologies/configurations, most optical links require (very) short electrical links between CMOS processing chips and optical components in a more specialized process. In this talk I will therefore describe key mixed-signal circuit techniques, relating especially to efficient implementation of multiple forms of equalization. I will also describe link architectures, particularly for clock and data recovery, that enable efficient implementation of electrical links operating at 56Gb/s and beyond.

**Elad Alon** is a Professor of EECS at UC Berkeley, as well as a co-director of the Berkeley Wireless Research Center. Prof. Alon's research focuses on energy-efficient integrated systems, and he has co-authored a number of papers that have received "best paper" awards from ISSCC, VLSI, and CICC.



### ADC/DAC/DSP-Based Transceivers for 400Gb and Beyond; Opportunities and Challenges from Kilometres to Hundreds of Kilometres Reach

*Ian Dedic, Acacia Communications, Wooburn Green, United Kingdom*

As shrinking CMOS process geometries bring down the power and cost of advanced DSP and high-speed ADCs and DACs, new shorter-reach application areas are opening up in addition to the longer-reach 100G+ transport applications that have dominated in recent years. This will be driven by: 1) the need for more and more data bandwidth without a proportional increase in optical bandwidth; 2) the fact that channel imperfections need complex DSP compensation at shorter reaches as baud rates increase; and 3) the need to minimise the number of optical carriers to reduce cost and power consumption. This talk will cover the challenges of applying such complex technologies, and taking advantage of their increased performance, in systems where small size/power/cost has traditionally meant that much simpler technologies were the only realistic solution. It will cover technology and integration tradeoffs as well as changes in thinking needed to enter such higher-volume and lower-cost markets.

**Ian Dedic** was born in 1959. After studying at Churchill College and Imperial College, he joined GEC UK in 1983 to design mixed-signal CMOS ICs. At Fujitsu from 1990 to 2015, he specialised in ADCs/DACs up to >100GS/s, and he joined Acacia in 2016 to work on technology for 400G and beyond. He holds more than fifty patents.



### Advanced Modulation and Signal Processing Empowering Optical Communication

*David P. Johnson, Ciena, Ottawa, Canada*

Digital optical receivers enable us to employ spectrally efficient and long-distance optical fiber transmission. More recently, advances in low-power CMOS technology have brought power and cost low enough to make digital signal processing (DSP) technology feasible in shorter-reach metro applications. Borrowing from wireless transmission, denser M-ary phase-shift keying and quadrature amplitude modulation (QAM) formats, receiver designs, and forward error correction algorithms might have been assumed to be easily adapted to coherent optical links; however, the channel and its impairments present different challenges, which cause the optical receiver architecture to be different. This presentation provides a brief history of optical communication, an overview of the components of a digital transmitter and receiver, a comparison of coherent and direct detection, popular modulation formats and constellations, an overview of optical impairments, and approaches to the DSP-domain circuitry used to overcome them. Flexible rates, reach and spectral efficiency should lead to a Q&A discussion of the current challenges and tradeoffs in chip architecture.

**David P. Johnson**, P.Eng., is a Digital ASIC Architect at Ciena working on WaveLogic silicon design. He received his Electrical Engineering degree from the University of Ottawa (B.A.Sc., 1989) and has spent more than twenty years in hardware architecture and digital IC design. He currently leads an ASIC team doing the FPGA prototyping for early software integration and ASIC verification. He conducts several multivendor interop trials and co-simulations of beyond-100-gigabit (B100G) OTU standards and new client interface protocols like Flexible Ethernet and 400Gb Ethernet. He is a member of IEEE.



### Leveraging Semiconductor Technologies for Packaging, Assembly and Test of Advanced Silicon Photonics

*Peter De Dobbelaere, Luxtera, Carlsbad, CA*

Silicon photonics technology is based on leveraging semiconductor wafer processing to manufacture photonic integrated circuits. It has proven to be a commercially viable technology for optical interconnect in high-performance computing and hyper-scale datacenter applications. However, for many photonic technologies there are challenges related to test and integration of electronics, light sources and optical fiber interfaces. We will show how semiconductor packaging, test and assembly techniques can be leveraged to address some of those challenges in the case of silicon photonics. These solutions not only allow high-volume manufacturing of advanced optical transceivers, but also enable close integration of optical transceiver functions with ASICs.

**Peter De Dobbelaere** was employed by IMEC, Belgium, from 1991 to 1995, working on various projects including short-reach optical interconnect and heterogeneous integration of III-V lasers with Si and polymer waveguides. From 1995 to 1999, he was with Akzo-Nobel N.V., The Netherlands and U.S., where he was engaged in product development and reliability of polymer-based thermo-optic waveguide switch devices. In 1999, he joined OMM Inc., San Diego, CA, where he was responsible for product and technology development of MEMS-based optical switches. His latest position there was CTO and Director of Product Engineering and Reliability. In 2004, he joined Luxtera, Inc., Carlsbad, CA, where he is currently responsible for technology development for silicon photonics.



### Optical Backplane Technology Using Fiber Wiring Sheet and Connectors

*Masahiro Aoyagi, AIST, Tsukuba, Japan*

In this presentation, an optoelectronic hybrid backplane technology is reported, which is applicable to 100G-400G Ethernet switches/routers. Optoelectronic packaging technologies, such as optical fiber wiring sheets with fine multimode fibers, compact EO/OE parallel link modules and small-size right-angled mirrorless backplane connectors, have been developed, which enable us to fabricate an optoelectronic hybrid backplane. These technologies are suitable for the development of high-performance backplanes and could be used for various types of switches/routers. To demonstrate their usefulness, an optoelectronic hybrid backplane prototype has been made in which an optical fiber wiring sheet with small-size backplane connectors is attached to zone 3 of an ATCA-PICMG 3.0 backplane. The development has been done by a collaborative research team including AIST and eleven companies. IEC standardization activity and further recent development are also reported.

**Masahiro Aoyagi** received the B.E. and D.E. degrees in electronic engineering from Nagoya Institute of Technology, Japan, in 1982 and 1991, respectively. He joined Electrotechnical Laboratory, Tsukuba, Japan, in 1982, where he has been engaged in the research and development of Nb, NbN superconducting devices and Josephson integrated circuits. He worked in the special section on Josephson computer technology from 1982 to 1994. He worked as a guest researcher in the National Physical Laboratory, Teddington, UK, from 1994 to 1995. He was a group leader of the High Density Interconnection Group, Nanoelectronics Research Institute (NeRI), National Institute of Advanced Industrial Science and Technology (AIST) from 2000 to 2010. He worked as a group leader in the national R&D projects of High Density Electronic System Integration from 1999 to 2004 and Functionally Innovative 3D-Integrated Circuit Technology from 2008 to 2012. He was a team leader of the Opto-Electronic System Integration Collaborative Research Team, NeRI, from 2004 to 2009. He was the Deputy Director of NeRI, AIST from 2012 to 2014. He is currently the Director of the Collaboration Promotion Unit, TIA Central Office, AIST. His present research field is high-performance high-density 3D system integration technology.

Dr. Aoyagi was awarded the Tsukuba prize in 1991 for the development of the Josephson prototype computer ETL-JC1. He has authored or co-authored 340 technical papers and has 150 patents. He was the chair of the IEEE Components, Packaging and Manufacturing Technology (CPMT) Japan Chapter from 2009 to 2010. He has been a member of the Board of Governors (BoG) for IEEE CPMT since 2012.

## F6: Advances in Energy Efficient Analog Design



**Organizer:** **Axel Thomsen**, Cirrus Logic Inc., Austin, TX  
**Committee:** **Bernhard Wicht**, Leibniz Universität Hannover, Hannover, Germany  
**Pieter Harpe**, Eindhoven University of Technology, Eindhoven, The Netherlands  
**Man Kay Law**, University of Macau, Macau, China  
**Young Cheol Chae**, Yonsei University, Seoul, Korea

Analog Design covers a wide range of applications. In this forum we focus on trends in analog design that are driven by hot applications. The desire for better wireless sensor nodes has driven advances in nano-power design and modeling, for infrastructure, power, and data-conversion circuits. Burst mode operation for oscillators, and better power efficiency for sensor interface circuits are also desired. This forum starts with an overview of limits to power efficiency and then dives into various design challenges that have been met by novel solutions.



### General Overview of Power Consumption Fundamental Limits in Analog Circuits

*Yannis Tsividis, Columbia University, New York, NY*

Micropower analog circuit design can be more efficient if guided by an understanding of fundamental limits, beyond which nature does not allow us to go. In this presentation, we review such fundamental limits. The relation between power dissipation and noise for several continuous-time analog blocks is discussed. A crucial distinction is made between signal-to-noise ratio and usable dynamic range. Ways to allow the signal-to-noise ratio to vary as needed, thus extending the usable dynamic range and resulting in corresponding power dissipation savings, are discussed.

**Yannis Tsividis** is Edwin Howard Armstrong Professor of Electrical Engineering at Columbia University, New York. His research deals with analog and mixed-signal integrated circuits at the device, circuit, and system level. He is a Life Fellow of IEEE. He received the 1986 IEEE W.R.G. Baker Award for the best IEEE publication, the 2003 ISSCC Lewis Winner Outstanding Paper Award, and the 2007 IEEE Gustav Robert Kirchhoff Award.



### Energy Efficient Nyquist-Rate ADCs

*Klaas Bult, Analog Design Consult B.V., Bosch en Duin, The Netherlands*

In large SoCs, data converters take a dominant position both from a performance as well as from an energy consumption point of view. The past two decades have shown a strongly intensified search for more power-efficient data converters, and in particular, power-efficient analog-to-digital converters. This presentation focuses on power efficiency of Nyquist-rate analog-to-digital converters and discusses what has been proposed in the open literature to reduce energy consumption, from a circuit as well as from an architectural point of view. To get a good grasp of how circuit and architectural choices affect power consumption, a method is introduced that allows a quick estimation of the power consumption of an ADC, based on the required SNDR, the sampling frequency, the technology as well as the chosen ADC architecture and circuit implementations. The method enables a comparison based on these choices and can show what their impact is on the power efficiency, without going through the elaborate design of several architectures. It also shows which recent inventions made a large impact on power efficiency and how these inventions can also be of use in other architectures.

**Klaas Bult** received MSc and PhD degrees from Twente University in 1984 and 1988, respectively. From 1988 to 1994, he worked as a Research Scientist at Philips Research Labs, where he worked on analog CMOS building blocks, mainly for application in video and audio systems. In 1993-1994, he was also a part-time professor at Twente University. From 1994 to 1996, he was an Associate Professor at UCLA, where he worked on analog and RF circuits for mixed-signal applications. In the same period, he was also a consultant with Broadcom Corporation, in Los Angeles, CA, and later in Irvine, CA, during which he started the Analog Design Group at Broadcom. In 1996, he joined Broadcom full-time as a Director, responsible for analog and RF circuits for embedded applications in broadband communication systems. In 1999, he became a Sr. Director and started Broadcom's Design Center in Bunnik, The Netherlands. In 2005, he was appointed Vice President and CTO of Central Engineering. As of 2016, he is an independent consultant on analog IC design, operating from The Netherlands.

Klaas Bult is an author of more than 60 international publications and holds more than 60 issued US patents. He is a Broadcom Fellow, an IEEE Fellow, was awarded the Lewis Winner Award for outstanding paper at ISSCC 1990, 1992 and 1997, was co-recipient of the Jan Van Vessem best European Paper Award at ISSCC 2004, and the Distinguished paper Award of ISSCC 2014. He was also awarded the ISSCC Best Evening Panel Award in 1997 and 2006 and the Best Forum Speaker Award at ISSCC 2011. He has served more than 12 years on the ISSCC Technical Program Committee, 18 years on the ESSCIRC Technical Program Committee and 7 years as a member of the ESSCIRC/ESSDERC Steering Committee.



### Energy-Efficient Amplifiers

*Gyu-Hyeong Cho, KAIST, Daejeon, Korea*

This presentation focuses on energy-efficient amplifier design. Two of the most popular energy-efficient amplification architectures are thoroughly analyzed, namely the multistage amplifier and the recently evolved “single-stage” amplifier architectures. Multistage amplifiers exploit multiple cascaded high-gain stages to efficiently extend bandwidth and use Miller capacitance for frequency compensation. Frequency compensation is usually the main issue involved in the design process. “Single-stage” amplifiers, on the other hand, adopt multiple low-gain high-bandwidth amplifications to improve power efficiency. Both architectures are addressed in this presentation, and an intuitive design-oriented analysis method is developed for Miller compensation analysis, which lends significant insights into the various Miller compensation designs.

**Gyu-Hyeong Cho** received the PhD degree from Korea Advanced Institute of Science and Technology (KAIST) in 1981. He was with the Westinghouse R&D Center in Pittsburgh in 1982-1983, and with the University of Wisconsin at Madison as a Visiting Professor in 1989. He joined the Department of Electrical Engineering at KAIST in 1984 and has been a full Professor since 1991. His early research was in the area of power electronics until the year 2000. Later, he shifted to analog integrated circuit design. His current research interests include power-management ICs, energy-harvesting circuits, plasma power sources, touch sensors, and drivers for AMOLED displays. He has authored or coauthored over 100 international journal papers, 180 international conference papers, and 80 patents. He received over 40 awards including Samsung Human-Tech Awards, Teaching Awards in KAIST, and the ISSCC Silkroad Award. He served as a member of ITPC at ISSCC from 2009 to 2012, and as guest Editor and Associate Editor of the IEEE Journal of Solid-State Circuits from 2012 to 2015. He is a Fellow of IEEE and received the Author-Recognition Award as one of the top 16 contributors at the ISSCC 60th Anniversary in 2013.



### High-Frequency Multiphase Hysteretic Switching Regulators for High Current Slew Rate SoCs

*D. Brian Ma, University of Texas, Dallas, TX*

Today, peak currents in modern SoCs have climbed to an unprecedentedly high level with aggressive slew rates on the order of 1A/ns. The drastically changing dynamic currents incur large supply-voltage droop and switching noise, which increase the risks of system black-out, power device breakdown, and circuit path failure. With inductor current slew rates at least two to three orders of magnitude lower, conventional voltage regulators passively combat the challenge by using bulky filtering capacitors. However, this is simply untenable for stringent latency and PCB budgets. This talk presents recent research development on synchronized multiphase hysteretic switching regulators. It proposes a concerted effort to combat the challenge through an integrated cross-layer solution from device to system level. Specifically, it proposes novel hysteretic control schemes to reduce loop transient response times, and clock synchronization techniques to manage switching noise spectrum and multiphase current sharing. To facilitate high-frequency, wide-power-range operation, current-sensing and efficiency-improvement strategies and circuits are also addressed.

**Brian Ma** is Distinguished Chair in Microelectronics and a full Professor in Electrical & Computer Engineering at the University of Texas at Dallas. Prior to his employment at UT Dallas, he was a faculty member with Louisiana State University from 2003 to 2004, and with University of Arizona from 2004 to 2010. Along his career path, he was awarded the Analog Devices Professorship (2004-2008), TxACE Chair Professorship (2010-2012), Erik Jonsson Distinguished Chair (2012-2017) and Distinguished Chair in Microelectronics (2017-present). Prof. Ma's research focuses on integrated power electronics, with primary interests in silicon-, GaN- and SiC-based power IC solutions for big data, IoT, automotive electronics and consumer electronics. His major research works are published in 1 book, 4 book chapters and over 160 journal and conference papers. Prof. Ma was a recipient of a United States National Science Foundation CAREER Award. He has received 9 research paper or design awards from international conferences and journals.



### Nano-Watt References and Oscillators

*Jae-Yoon Sim, Pohang University of Science and Technology, Pohang, Korea*

With the emergence of wearable and implantable technologies, there has been strong demand for the development of circuit techniques that achieve ultra-low power consumption while meeting traditional requirements of stable performance. Voltage reference generators and oscillators are two essential circuit blocks for supplying internal DC and timing references, and they must normally remain on even during power-down modes.

This talk presents issues in the design of these circuits to overcome trade-off limitations between power consumption and immunity to process/voltage/temperature variations. Various approaches to achieve nano-Watt and sub-nano-Watt consumption for voltage references and oscillators are discussed.

**Jae-Yoon Sim** received the PhD degree in Electrical Engineering from Pohang University of Science and Technology (POSTECH) in 1999. Since 2005, he has been with POSTECH, where he is currently a Professor. He has served in the ITPC of ISSCC, VLSIC and ASSCC. His research interests include links and sensor interface circuits.



### Nano-Watt Ultra-Low-Voltage SAR ADC Design

*Chih-Cheng Hsieh, National Tsing Hua University, Hsinchu, Taiwan*

For IoT applications, ADCs with moderate resolution and extremely low power consumption are in high demand. SAR ADCs have demonstrated continuously improving power efficiency with technology migration and ultra-low-voltage operation. This talk explores techniques to minimize the absolute power consumption to the nW level. First, design tradeoffs for ultra-low-voltage operation are discussed. Then, power reduction techniques for the main building blocks, including DAC, comparator, and SAR logic, are covered. Examples from the literature and ongoing work are provided with a discussion of implementation issues.

**Chih-Cheng Hsieh** received the PhD degree from National Chiao Tung University, Taiwan, in 1997. From 1999 to 2007, he was with Pixart Imaging Inc., Taiwan, where he led the Mixed-Mode IC department as a Senior Manager and helped the company to successfully conduct an IPO in 2007. In 2007, he joined National Tsing Hua University, Taiwan, where he is currently a Full Professor. His research interests include low-voltage low-power ADC and CMOS image sensor IC design.



### Nanopower DC-DC Converters: From Harvesting Interface to Power-Supply Applications

*Gaël Pillonnet, CEA-LETI-MINATEC, Grenoble, France*

Power management is a key challenge in the next SoC generation, where circuits reach nano-power average consumption and the energy storage capacity is strongly limited. As the power range of DC-DC converters moves from mW-to-W to nW-to-mW, integrated circuits dedicated to power management (harvesting interfaces, power supplies...) face new challenges: maintaining high power efficiency, small die and footprint area, and low quiescent current, while addressing large power and voltage dynamic ranges with duty-cycled operation. This talk gives a system- and circuit-level overview of power management circuits and surrounding key elements (passives, micro-storages, harvesters...) suited to smartly manipulating sub-mW power.

**Gaël Pillonnet** received a PhD from INSA Lyon, France in 2007, and was with STMicroelectronics from 2004 to 2008. Then, he was an Associate Professor at University of Lyon from 2009 to 2013. During this period, he did a sabbatical year at University of California, Berkeley. He is now a full-time researcher at CEA-LETI, Grenoble. His research interests are micro-energy harvesting interfaces, sub-mW power supplies, sub-W actuation circuits, and micromechanical systems.



### Energy-Efficient Clock Generation for IoT Applications

*Ming Ding, imec - Holst Centre, Eindhoven, The Netherlands*

Duty cycling has become popular to reduce the overall power consumption in IoT systems and to extend battery lifetime. However, this requires energy-efficient clock generation. In this talk, the role of clocking at the system level and the technical challenges for on-demand burst-mode operations are discussed. Also, an overview of different state-of-the-art low-energy clock-generation techniques and their performance trade-offs in terms of frequency, stability and noise is provided. Afterwards, we highlight a few clock-generation circuit examples to show how the challenges can be addressed.

**Ming Ding** received the BSc degree in 2009 from Huazhong University of Science and Technology, China, and the MSc Degree (Cum Laude) in 2011 from Eindhoven University of Technology, The Netherlands. In 2011, Ming joined Holst Centre/imec as a researcher. His research interests include low-power clock generation, data converters, and ultra-low-power wireless transceiver design for IoT applications. He has authored or co-authored 10+ papers in ISSCC, VLSI, RFIC, JSSC, and he holds several patents.

# EE1: Student Research Preview (SRP)

The Student Research Preview (SRP) will highlight selected student research projects in progress. The SRP consists of 25 one-minute presentations followed by a Poster Session, by graduate students from around the world, who have been selected on the basis of a short submission concerning their on-going research. Selection is based on the technical quality and innovation of the work. This year, the SRP will be presented in three theme sections: Communications and Power; Deep Learning and Biomedical Circuits; Memory, Sensors, and Mixed-Signal Circuits.

The Student Research Preview will include a brief talk by a distinguished member of the solid-state circuits community, Professor Tom Lee, Stanford University. SRP begins at 7:30 pm on Sunday, February 11th. SRP is open to all ISSCC registrants.



**Chair: SeongHwan Cho**  
KAIST



**Secretary: Denis Daly**  
*Omni Design Technologies*

## Session 1: Communications and Power



**Yoonmyung Lee**  
*SungKyunKwan University, Korea*



**Shahriar Mirabbasi**  
*University of British Columbia, Canada*



**SRP-1.1**  
Jahoon Jin  
Sungkyunkwan University, Korea



**SRP-1.2**  
Xingqiang Peng  
University of Macau, Macau



**SRP-1.3**  
Daniele Montanari  
University of Pavia, Italy



**SRP-1.4**  
Praveen M.V.  
IIT Madras, India



**SRP-1.5**  
Kai Xu  
University College Dublin, Ireland



**SRP-1.6**  
Younghyun Lim  
Ulsan National Institute of Science and Technology (UNIST), Korea



**SRP-1.7**  
Mao-Ling Chiu  
National Taiwan University, Taiwan



**SRP-1.8**  
Hyeonji Lee  
Sungkyunkwan University, Korea

|  |  |  |
|---|---|---|
| **Chair** | SeongHwan Cho | KAIST |
| **Secretary** | Denis Daly | Omni Design Technologies |
| **Advisor** | Anantha Chandrakasan | MIT |
| **Advisor** | Jan Van der Spiegel | University of Pennsylvania |
| **Media/Publications** | Laura Fujino | University of Toronto |
| **A/V** | Trudy Stetzler |  |

### COMMITTEE MEMBERS

|  |  |
|---|---|
| Jason Anderson | University of Toronto, Canada |
| Masoud Babaie | Delft University of Technology, Netherlands |
| Andrea Baschirotto | University of Milan-Bicocca, Italy |
| Ben Calhoun | University of Virginia, VA |
| SeongHwan Cho | KAIST, Korea |
| Hayun Chung | Korea University, Korea |
| Denis Daly | Omni Design Technologies, MA |
| Shidhartha Das | ARM, United Kingdom |
| Andreas Demosthenous | University College London, United Kingdom |
| Chun-Huat Heng | National University of Singapore, Singapore |
| Makoto Ikeda | University of Tokyo, Japan |
| Seulki Lee | IMEC-NL, Netherlands |
| Yoonmyung Lee | SungKyunKwan University, Korea |
| Salvatore Levantino | Politecnico di Milano, Italy |
| Qiang Li | University of Electronic Science & Tech., China |
| Shih-Chii Liu | University of Zurich/ETH Zurich, Switzerland |
| Shahriar Mirabbasi | University of British Columbia, Canada |
| Tinoosh Mohsenin | University of Maryland, MD |
| Cormac O'Connell | TSMC, Canada |
| Mondira Pant | Intel, MA |
| Shanthi Pavan | Indian Institute of Technology, India |
| Jae-sun Seo | Arizona State University, AZ |
| Mingoo Seok | Columbia University, NY |
| Farhana Sheikh | Intel, OR |
| Bing Sheu | Chang Gung University, Taiwan |
| GuoXing Wang | Shanghai Jiao Tong University, China |
| Jeffrey Weldon | University of Hawaii, HI |
| Chung-Yu Wu | National Chiao Tung University, Taiwan |
| Jerald Yoo | National University of Singapore, Singapore |
| Samira Zaliasl | Ferric, New York, NY |

## Session 2: Deep Learning and Biomedical Circuits



**Jae-sun Seo**  
*Arizona State University, Tempe, AZ*



**Tinoosh Mohsenin**  
*University of Maryland, Baltimore, MD*



**SRP-2.1**  
Jianxun Yang  
Tsinghua University, China



**SRP-2.2**  
Zhewei Jiang  
Columbia University, United States



**SRP-2.3**  
Shuo-An Huang  
National Taiwan University, Taiwan



**SRP-2.4**  
Huwan Peng  
University of Washington, United States



**SRP-2.5**  
Pyungwoo Yeon  
Georgia Institute of Technology, United States



**SRP-2.6**  
Zeliang Wu  
Inst. of Microelectronics, Tsinghua University,  
Beijing, China



**SRP-2.7**  
Yuting Hou  
Shanghai Jiao Tong University, China



**SRP-2.8**  
Sanfeng Zhang  
University of Electronic Science and Technology of China, China

## Session 3: Memory, Sensors and Mixed-Signal Circuits



**Cormac O'Connell**  
*TSMC, Canada*



**Samira Zaliasl**  
*Ferric, New York, NY*



**SRP-3.1**  
Siming Ma  
Harvard University, United States



**SRP-3.2**  
Dhruv Patel  
University of Toronto, Canada



**SRP-3.3**  
Seyedhamidreza Motaman  
Pennsylvania State University, United States



**SRP-3.4**  
Xiaopeng Zhong  
Hong Kong University of Science and Technology,  
Hong Kong



**SRP-3.5**  
Yi Luo  
University of British Columbia, Canada



**SRP-3.6**  
Sujin Park  
KAIST, Korea



**SRP-3.7**  
Jiaji Mao  
State-Key Lab of Analog and Mixed-Signal VLSI,  
University of Macau, Macau



**SRP-3.8**  
Bangan Liu  
Tokyo Institute of Technology, Japan



**SRP-3.9**  
Xiaofeng Yang  
University of Macau, Macau

## Poster Session



**Farhana Sheikh**  
*Intel, Hillsboro, OR*



**Mondira Pant**  
*Intel, Hudson, MA*

## EE2: Workshop on Circuits for Social Good

|                   |     |
|-------------------|-----|
| <b>Chair:</b>     | Vivienne Sze, Massachusetts Institute of Technology, Cambridge, MA |
| <b>Committee:</b> | <b>Alison Burdett</b>, Sensium Healthcare, Abingdon, Oxfordshire, United Kingdom<br><b>Sonia Leon</b>, Intel, Santa Clara, CA<br><b>Rikky Muller</b>, University of California, Berkeley, Berkeley, CA<br><b>Farhana Sheikh</b>, Intel, Hillsboro, OR<br><b>Yildiz Sinangil</b>, Apple, Cupertino, CA<br><b>Trudy Stetzler</b>, Halliburton, Houston, TX<br><b>Ingrid Verbauwhede</b>, KU Leuven, Leuven, Belgium<br><b>Alice Wang</b>, MediaTek, San Jose, CA<br><b>Rabia Tugce Yazicigil</b>, Massachusetts Institute of Technology, Cambridge, MA |

The Workshop on Circuits for Social Good highlights various ways that circuits can help address some of the most important challenges facing society today, ranging from health care to energy conservation.

The program aims to give a broad perspective on how one can have meaningful societal impact. It begins with several keynotes and invited talks from industry, academia and startups, followed by interactive round-table discussions on topics, including machine learning, medical devices, next generation communications, security and IoT, as well as discussions on career paths in research, product development, and entrepreneurship.

### KEYNOTES



#### Winning the Game in a Male-Dominated Industry

6:00 - 6:30 PM

**Teresa H. Meng**, Reid Weaver Dennis Professor in Electrical Engineering, Emerita, Stanford University, Palo Alto, CA  
Founder of Atheros Communications

In our male dominated industry, gender discrimination can take many forms. Identifying how and when discrimination occurs is the first step in navigating through our careers. To reach our goals, it is necessary to choose which of the gender discrimination battles to fight. Winning the game requires strategic adaptability and a high degree of tenacity. This talk will cover some of my personal experiences in overcoming these challenges.



#### Low-Power Design: How Can We Help Become Green?

6:30 - 7:00 PM

**Nevine Nassif**, Intel Fellow, Hudson, MA

Computational devices, now synonymous with our everyday lives, are present everywhere from our smartphones to the large-scale servers that power industries from agriculture, transport and medicine to entertainment. This new computational age stems from the advances in performance and computational power of processors seen in recent years, which allow data processing on a scale unimaginable only a few decades ago. However, today's algorithms require more and more powerful compute engines as well as data center capacity, which has led to an increase in the carbon footprint of our computing infrastructure. As engineers, we need to continually deliver increasing performance at the same or lower power point. I will offer an overview of power reduction techniques, circuits, and technologies that we have developed over the past few years that enable the path to ecologically responsible computing.

### INVITED TALKS



#### Pioneering Ultra-Low Power Technologies to Empower Personal Healthcare

7:00 - 7:15 PM

**Esther Rodriguez-Villegas**, Professor/Chair in Low Power Electronics at Imperial College London, United Kingdom  
Founder and CEO at Acurable; Co-Founder and CSO at TainiTec

Power consumption has been an ever-present design challenge in integrated circuit design for well over a decade, in a wide range of applications. However, when it comes to medical technologies, power consumption becomes more than just a specification parameter to be taken into account in the design of the individual circuit blocks. To end up with a successful low power medical device, circuits and systems IC designers need to be aware of issues that go beyond the typical design constraints. Optimum solutions require knowledge, amongst others, of the specific clinical application and the medical regulatory constraints. This talk will share part of my personal journey as a designer of ultra low power medical systems, and the lessons I learnt along the way.



#### Driving A Ground-Breaking Ultrathin Flexible Printed Battery to Market – My Journey From Technologist to Entrepreneur

7:15 - 7:30 PM

**Christine Ho**, Co-Founder and CEO at Imprint Energy, Alameda, CA

Bringing a new technology to market is hard. So is transforming from researcher and technologist to an entrepreneur. I will be sharing my personal experiences: from inventing a revolutionary battery technology in the lab, making the emotional leap to start Imprint Energy, growing entrepreneurial muscles and instincts, and ultimately driving Imprint's battery product to market.


### **Research and Career Round Tables (Talk to an Expert!)**

**7:30pm - 8:00pm**  
Short Pitches from Each Table

**8:00pm - 9:00pm**  
Round Table Discussions

#### **Next-Generation Communications**

Alyssa Apsel, Professor at Cornell, Ithaca, NY  
Azita Emami, Professor at Caltech, Pasadena, CA

#### **Machine Learning & Multimedia Systems**

Vivienne Sze, Associate Professor at MIT, Cambridge, MA  
Marian Verhelst, Assistant Professor at KU Leuven, Leuven, Belgium

#### **Medical Devices and Applications**

Rikky Muller, Assistant Professor at University of California, Berkeley, Berkeley, CA  
Esther Rodriguez-Villegas, Professor at Imperial College London, London, United Kingdom

#### **Security and IoT**

Edith Beigne, Senior Scientist at CEA-LETI, Grenoble, France  
Ingrid Verbauwhede, Professor at KU Leuven, Leuven, Belgium

#### **Careers in Industry**

Andreia Cathelin, Fellow at STMicroelectronics, Crolles, France  
Yildiz Sinangil, Circuit Designer at Apple, Cupertino, CA  
Trudy Stetzler, Engineering Project Manager at Halliburton, Houston, TX  
Bich-Yen Nguyen, Senior Fellow at Soitec, Austin, TX  
Sonia Leon, Principal Engineer at Intel, Santa Clara, CA

#### **Careers in Academia**

Terri Fiez, Professor & Vice Chancellor of Research at University of Colorado Boulder, Boulder, CO  
Milin Zhang, Assistant Professor, Tsinghua University, Beijing, China

#### **Entrepreneurship**

Christine Ho, Co-Founder of Imprint Energy, Alameda, CA  
Teresa H. Meng, Founder of Atheros Communications, Palo Alto, CA

## EE3: Industry Showcase



**Organizers:** **Alison Burdett**, *Sensium Healthcare, Abingdon, Oxfordshire, United Kingdom*  
**Eugenio Cantatore**, *Eindhoven University of Technology, Eindhoven, The Netherlands*  
**Kush Gulati**, *Omni Design Tech., Milpitas, CA*  
**Yan Li**, *Western Digital, Milpitas, CA*

### COMMITTEE MEMBERS

|                 |                                     |
|-----------------|-------------------------------------|
| Shuichi Nagai   | Panasonic, Osaka, Japan             |
| Long Yan        | Samsung, Hwasong-si, Korea          |
| Abbas Komijani  | Apple, Cupertino, CA                |
| Roberto Nonis   | Infineon, Villach, Austria          |
| Alan Wong       | EnSilica, Abingdon, UK              |
| David McLaurin  | Analog Devices, Raleigh, NC         |
| John Maneatis   | TrueCircuits, Los Altos, CA         |
| Calvin Chao     | TSMC, Hsinchu City, Taiwan          |
| Tim Piessens    | icSense, Leuven, Belgium            |
| Vadim Ivanov    | Texas Instruments, Tucson, AZ       |
| Jan Westra      | Broadcom, Bunnik, The Netherlands   |
| Yung-Shiang Shu | MediaTek, Hsinchu City, Taiwan      |
| Stephane LeTual | STMicroelectronics, Crolles, France |
| Yogesh Ramadas  | TI, San Jose, CA                    |

This year at ISSCC, on the 65th anniversary of the conference, a new event called the Industry Showcase will be introduced for the first time. Following the recognized role of ISSCC as the foremost global forum for advances in solid-state circuits and systems-on-chip (SoCs), the goal of this event is to highlight the role of silicon in the creation of novel products. It will feature short presentations as well as interactive demonstrations where attendees can have a hands-on experience with each featured innovation. The presentations were chosen through a nomination and voting process by members of the Industry Showcase Committee, and represent an exciting introduction to the next generation of applications and products enabled by the sustained evolution of solid-state integrated circuits.

### Industry Showcase Participants:

#### Infineon Technologies, Germany

60GHz Antenna-in-Package Radar for Sensing applications

Presenter: Jagjit Singh Bal

BGT60TR24 is a 2TX-4RX V-Band Radar with antennas in package. It is manufactured in Infineon's qualified B11HFC BiCMOS process. The wide-bandwidth, low-jitter FMCW modulation enables high range resolution and high-accuracy object detection in 3-D space. The sensor provides digitized data over an SPI/QSPI interface. High integration and small form factor enable integration of such a sensor in space-constrained consumer devices.

Acknowledgements: Saverio Trotta, Reinhard Wolfgang Jungmaier, Ashutosh Baheti, Dennis Noppeney, Roberto Nonis.

#### Google, Mountain View, CA

Google Tensor Processing Units

Presenters: Norman P. Jouppi and Amir Salek

We present tradeoffs involved in the design of Google's first Tensor Processing Unit (TPU), a domain-specific accelerator for machine learning workloads. Our first TPU is capable of up to 92 TOPS for neural-network inference in a 28nm technology using DDR3 memory. Performance results from in-datacenter measurements running production applications are presented. We also give an overview of the capabilities of Google's second TPU, which accelerates both inference and training.

Acknowledgements: We'd like to thank the design teams of the first and second TPUs, including our chip, system, and software engineers as well as our management and support teams.

#### Qualcomm Technologies, Inc., San Diego, CA

Power efficient structured-light 3D depth sensing camera technologies for mobile devices

Presenters: Biay-Cheng Hsieh, James Nash, Sami Khawam, Nousias Ioannis, Mark Muir, Khoi Le, Kalin Atanassov, Sergio Goma

A 3D structured-light depth-sensing camera system is introduced with a fully integrated, enhanced-NIR-response CMOS image sensor, a laser with a Diffractive Optical Element (DOE) embedded in a Wafer-Level-Optics (WLO) projector, and a programmable cell-array ISP engine that processes code detection for depth-map and 3D point-cloud generation.

Acknowledgements: Calvin Chao and Dun-Nien Yau of TSMC; Amit Mittra, Tony Chiang, Hank Hsiao, and Sam Kuo of Himax Technologies.

#### Sony Corporation, Tokyo, Japan

Projection and Sensing Technology of Xperia Touch

Presenter: Kazumasa Kaneda

Xperia Touch transforms your wall or table into an interactive touchscreen. Built with the latest Sony intelligence, it is a portable projector that's easy to use and remarkably smart. It includes Sony's unique SXRD ultra-short-throw projection unit.

#### Elliptic Labs, San Francisco, CA

Ultrasound Virtual Sensors for Mobile, VR, and IoT

Presenter: Guenael Strutt, VP of Product Development at Elliptic Labs

Elliptic Labs technology expands human-computer interaction through the use of ultrasound virtual sensor products. These software-only sensors enable touchless interactions in any device that contains a speaker and a microphone, allowing, for example, a smartphone to recognize a natural hand gesture, or a voice assistant to recognize movement or presence in a home. Elliptic Labs' products bring the magic of ultrasound all the way from the raw microphone signal to the phone apps, so that developers do not have to perform any data analysis or interpretation, but instead can focus on creating great experiences.

#### NVIDIA, Santa Clara, CA

NVDLA, an Open Source IP for AI Inferencing

Presenter: Frans Sijstermans

Our open-source Deep Learning Accelerator, NVDLA, provides a hardware and software solution for inferencing "at the edge". While most deep learning is still done in the cloud, there are many reasons, such as latency, connectivity, and network bandwidth, to move more inference to edge devices. Such devices are typically more cost- and power-sensitive than cloud computers, whereas some compromises on flexibility are acceptable. The NVDLA fixed-function inferencing module was originally developed for use in the NVIDIA self-driving car platform. To accelerate adoption of deep learning, we decided to open-source the design. NVDLA will fit seamlessly in NVIDIA's deep learning platform and will also support all deep learning frameworks. We are continuing to make enhancements and we are encouraging contributions from the community.

**Ultrahaptics, Bristol, UK**

Mid-air haptic feedback through modulated ultrasound

Presenter: Tom Carter

Ultrahaptics will present its mid-air haptic feedback system that uses modulated ultrasound to create the sensation of touch in mid-air. The system enables users to feel virtual objects, switches, buttons and sliders, without being required to hold controllers, or wear gloves: the sensation tracks their hand in mid-air. Currently deployed in multiple applications, from automotive controls to augmented reality immersive experiences, the technology will be available at the demonstration for people to experience, and feel, for themselves.

**Novelda AS, Oslo, Norway**

Pulse-based radar for presence detection

Presenter: Dag T. Wisland

Most commercially available occupancy sensors are based on passive infrared radiation sensors capable of detecting minor movements, but lack the ability to detect the presence of stationary subjects. Pulse-based radars operating across large bandwidths are capable of sensing human presence based on vital signs detection and offer unambiguous distance measurements. This demo will showcase a pulse-based radar SoC applied as a presence detection sensor. Different signal processing schemes suitable for different end-user applications will be demonstrated, showing the versatility of the system.

**CHRONOCAM, Paris, France**

Frame-free vision system for high-speed low-power real-time machine vision

Presenters: Geoffrey Burns, Christoph Posch

This demo showcases a compact low-power vision system for high-speed real-time machine vision applications. The system achieves new tradeoffs in temporal resolution, data rate and computational complexity by using image sensors based upon pixels that individually auto-sample visual information. Unlike conventional image sensors running at a fixed frame rate, the output data rate of such a sensor is independent of its acquisition speed. With this approach, visual data acquisition becomes simultaneously fast and sparse, making it possible to combine high-speed (kiloframes-per-second equivalent) acquisition of fast transient processes with real-time visual data processing in a compact and low-power system.

## EE4: Figures-of-Merit on Trial



### Organizers:

**Kostas Doris**, NXP, Eindhoven, The Netherlands

**Stefano Stanzione**, imec-NL, Eindhoven, The Netherlands

**Paul Ferguson**, Analog Devices, Wilmington, MA

Mixed-signal/RF circuits are characterized by a wide variety of performance parameters and diverse functionality. A figure of merit (FOM) provides a unique, simple and objective metric that allows normalizing and comparing circuits and systems of the same class. On the other hand, does the minimalistic simplicity of any single metric sacrifice more than it offers? Doesn't engineering practice intrinsically require designing and judging a far more complex reality than the monochromatic reductionism that an FOM can provide? For instance, in the case of analog-to-digital converters, the ability to drive the ADC's input, to clock it, to integrate it or interface it with other processing units, to supply power to it, are just a few real-life examples of factors that can make or break a converter architecture and the signal chain embedding it. These factors are not considered in any FOM, with potentially catastrophic consequences.

Enough already with the cult of FOMs? Open the doors to a new age of purely human subjective calls? You, the audience, be the judge.

This panel will probe the weaknesses and strengths of popular analog FOMs in an entertaining and educational way. To this end, the room will become a tribunal with the moderator as judge. For each FOM on trial, two panelists will officiate, one as the defending advocate of the FOM and the other as the prosecutor, while the audience will become the jury that decides which of the two contestants wins.

### Abstracts



#### **Moderator: Gabriele Manganaro, Analog Devices, Wilmington, MA**

Gabriele Manganaro (S'95, M'98, SM'03, F'16) holds a Dr.Eng. and a Ph.D. degree in Electronics from the University of Catania, Italy. Starting in 1994, he did research with ST Microelectronics and at Texas A&M University. He worked in data converter IC design at Texas Instruments, Engim Inc, and as Design Director at National Semiconductor. Since 2010 he has been Engineering Director for High-Speed Converters at Analog Devices. He served on the ISSCC technical subcommittee for Data Converters for seven consecutive years. He was Associate Editor for the IEEE Transactions On Circuits and Systems - Part II and then Associate Editor, Deputy Editor in Chief and finally Editor in Chief for the IEEE Transactions On Circuits and Systems - Part I. He has authored/co-authored more than 60 papers, three books (notably "Advanced Data Converters", Cambridge University Press, 2011) and has been granted 15 US patents, with more pending. He was recipient of scientific awards, including the 1995 CEU Award from the Rutherford Appleton Laboratory (UK), the 1999 IEEE Circuits and Systems Outstanding Young Author Award and the 2007 IEEE European Solid-State Circuits Conference Best Paper Award. He is an IEEE Fellow (since 2016), a Fellow of the IET (since 2009), Member of Sigma Xi, and a member of the Board of Governors for the IEEE Circuits and Systems Society (2016-2018).



#### **Filip Tavernier, KU Leuven, Leuven, Belgium**

FOMs are dangerous creatures and need to be treated with care! You can prove almost anything with them if they are not used in their proper context. Almost all the commonly used FOMs in the power management domain suffer from a seriously restricted validity across the whole working region. For example, although the peak efficiency is clearly an important metric in all applications, it suffers from a significant skew: two converters with 98% and 99% efficiency have a 50% difference in loss, while two converters with 80% and 81% efficiency have losses that are almost equal. Moreover, some FOMs, like power density, lose a lot of their significance once external components are used. A last example is output ripple. Simply characterizing ripple as a peak voltage in the time domain is only important for digital loads, while typical analog applications depend more on the frequency characteristic of this ripple. Unfortunately, such remarks can (and need to) be made about any FOM in this domain. Therefore, depending on the specific application, the technology used and some other requirements, a set of relevant FOMs should be selected, optimized for and used to compare with the state-of-the-art.
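The efficiency skew described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes loss is expressed as a fraction of input power, and the function names are hypothetical, not taken from the panel.

```python
def loss_fraction(efficiency):
    """Fraction of input power dissipated, for an efficiency in [0, 1]."""
    return 1.0 - efficiency


def relative_loss_change(eff_a, eff_b):
    """Relative reduction in loss when moving from eff_a to eff_b."""
    loss_a = loss_fraction(eff_a)
    loss_b = loss_fraction(eff_b)
    return (loss_a - loss_b) / loss_a


# Same one-point efficiency step, very different change in dissipated power.
high_end = relative_loss_change(0.98, 0.99)  # loss goes from 2% to 1%
low_end = relative_loss_change(0.80, 0.81)   # loss goes from 20% to 19%

print(f"98% -> 99%: loss shrinks by {high_end:.0%}")
print(f"80% -> 81%: loss shrinks by {low_end:.0%}")
```

Under this assumption, the one-point step at the high end halves the loss, while the same step at the low end changes it by only about a twentieth.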

**Filip Tavernier** (M'05) obtained the Master's degree in Electrical Engineering (ir.) and the PhD degree in Engineering Science (dr.) from the KU Leuven, Leuven, Belgium, in 2005 and 2011 respectively.

From 2011 to 2014, he was a Senior Fellow in the microelectronics group at the European Organization for Nuclear Research (CERN) in Geneva, Switzerland. From 2014 to 2015, he was a Postdoctoral Researcher at the Department of Electrical Engineering (ESAT-MICAS) of the KU Leuven. Since October 2015, he has been an Assistant Professor at the KU Leuven within the same department. His main research interests include circuits for optical communication, data converters, power converters and chips for radiation environments.

Prof. Tavernier is treasurer of the IEEE SSCS Benelux Chapter, member of the technical program committee of ESSCIRC, ESSCIRC 2017 tutorial chair and SSCS webinar coordinator for Europe.



#### **Bob Dobkin, Analog Devices, Milpitas, CA**

It almost seems that every new development, whether in process or circuit, results in a paper with a new Figure of Merit. These FOMs magically make the new developments the "best". It is interesting to analyze the contortions some authors resort to in generating the FOM. What we need is a Figure of Merit of the FOM so we can compare FOMs. While there is currently no FOM for FOMs, one can be generated using known data and techniques.

**Bob Dobkin** was a founder and Chief Technical Officer of Linear Technology Corporation. Prior to 1999, he was responsible for all new product development at Linear. Before founding Linear Technology in 1981, Mr. Dobkin was Director of Advanced Circuit Development at National Semiconductor for eleven years. He has been intimately involved in the development of high-performance linear integrated circuits for over 40 years and has generated many industry-standard circuits. Mr. Dobkin holds over 100 patents pertaining to linear ICs and has authored over 50 articles and papers. He attended the Massachusetts Institute of Technology. Linear was purchased by Analog Devices in 2017. Mr. Dobkin assumed the position of CTO emeritus.

**Anton de Graauw, NXP Semiconductors, Eindhoven, The Netherlands**

Classical FOMs like $f_t$, $f_{max}$, $V_{BD}$ and $NF_{MIN}$ are commonly used to judge circuit performance in various semiconductor technologies. It is debatable, however, how suitable these FOMs are for describing the performance of mm-wave circuits, because technology characteristics such as the quality factor and tolerances of passive components and their coupling to the substrate are typically not considered, even though they are very relevant for many key functions such as impedance matching, power combining and filtering.

It is therefore worthwhile to consider extended FOMs that capture all key technology characteristics that determine mm-wave circuit performance. Defining a set of extended FOMs is, however, quite a challenge and raises many topics for debate. For example, which technology characteristics should be included, how do they link to circuit performance, and can we still use simple analytical expressions or do we need a different approach?

**Anton de Graauw** was born in Delft, The Netherlands, in 1966. He received his Master of Science degree in Electrical Engineering from the Technical University of Delft in 1993. Anton worked in several R&D positions for N.K.F. Telecom, Philips Components, Philips Semiconductors and NXP in the areas of fiber-optic CATV systems, RF and mm-wave communication and radar transceiver chips and antenna modules. He currently works as an IC Architect on Car-Radar transceivers and integrated antennas at NXP in Eindhoven, The Netherlands.

**Jason T. Stauth, Thayer School of Engineering at Dartmouth, Hanover, NH**

What is the key to an accepted ISSCC paper? A brand new FOM that lifts your work to a position of unrivaled genius – never to be seen again! In power management, there are the obvious metrics – efficiency, cost, size, etc. But there have been discussions (controversies?) related to density metrics: power or current per unit volume or area. There are also interesting ways to compare topologies using active and passive device utilization metrics such as the $V \times A$ product (summed across switches) or $C \times V^2$ (summed across capacitors). However, power electronics covers an incredibly diverse application space. There are a variety of application-specific considerations (switching frequency, spectral content, transient performance, nonlinearities) that limit the effectiveness of simple scaling models. Can we unify these into a simple, straightforward FOM? We will see.
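As a rough illustration of the device utilization metrics mentioned above, the sketch below sums a $V \times A$ product over switches and $C \times V^2$ over capacitors. The function names, the choice of blocking voltage and conducted current as the per-switch stresses, and all component values are assumptions made for this example; the panel abstract does not specify them.

```python
def switch_va_product(switches):
    """Sum of V x A over switches.

    Each entry is (voltage_stress_in_volts, current_stress_in_amps).
    Which voltage and current to use (peak, RMS, ...) is an assumption here.
    """
    return sum(v * i for v, i in switches)


def capacitor_cv2(capacitors):
    """Sum of C x V^2 over capacitors.

    Each entry is (capacitance_in_farads, voltage_in_volts).
    """
    return sum(c * v ** 2 for c, v in capacitors)


# Hypothetical topology: four identical switches and two capacitors.
switches = [(6.0, 1.0)] * 4
capacitors = [(1e-6, 6.0), (2e-6, 3.0)]

print("Total switch V*A:", switch_va_product(switches))
print("Total capacitor C*V^2:", capacitor_cv2(capacitors))
```

Two topologies delivering the same output can then be compared by these totals, with the caveat from the abstract that such simple scaling models ignore switching frequency, spectral content, and transient behavior.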

**Jason Stauth** received his M.S. and Ph.D. degrees from the University of California, Berkeley in 2006 and 2008 respectively, where he studied high-frequency power electronics and RF power amplifiers. He is currently Associate Professor at the Thayer School of Engineering, Dartmouth College, Hanover, NH. His research interests include high-density power electronics and integrated circuits for applications in renewable energy, transportation, and energy storage. Dr. Stauth serves as an Associate Editor for the IEEE Transactions on Power Electronics and IEEE Journal of Emerging and Selected Topics in Power Electronics. He is a recipient of the National Science Foundation CAREER Award.

**Lawrence Loh, MediaTek**

In IC design communities, Figures of Merit (FOMs) are usually created and adopted in an attempt to replace multiple lines of specifications with a single, simpler index. Unfortunately, there seems to be no golden or commonly recognized way to weight or normalize specs properly. Most popular FOMs, such as those commonly used for evaluating A/D converters and SerDes, usually employ a performance index (e.g. speed or conversion rate) normalized by the power consumption. Nevertheless, die area and cost, bill of materials, design difficulty, and re-usability are just a few of the many important practical factors that IC designers should take into account. In this evening session the panelist would like to share his distinctive views on how an IC design company with mostly big digital SOC platforms should benchmark various building blocks and subsystems to ensure its competitiveness in the industry.

**Lawrence Loh** is a Corporate Senior Vice President of MediaTek Inc. He oversees the company's Central Engineering Group, responsible for engineering the company's SOCs and chipsets design, development and implementation activities for all MediaTek's product lines including mobile communication, application processors, wireless connectivity, IOT, automotive, home entertainment, optical storage and broadband/networking business. He is also serving as President of MediaTek USA, Inc., responsible for the company's global operations in Europe and America.

Dr. Loh started his first circuit design position at IMP and later he joined Cirrus Logic, where his last position was Director of Analog IC Engineering. In 1998, he founded Silicon Bridge Inc., where he successfully led a number of analog/mixed-signal IC development projects with major semiconductor companies including MediaTek and Altera Corporation. Before joining MediaTek in 2004, he contributed to the IC design industry in areas of read/write channels for magnetic and optical storage, high-performance analog filters, solid-state fingerprint sensors, high-speed SERDES and wireline transceivers for various business applications. He received his Ph.D. degree in Electrical Engineering from Texas A&M University, College Station, Texas. He has authored/co-authored dozens of technical papers/patents in areas of analog and mixed-signal integrated circuits/systems design and has contributed many panel talks and invited keynote speeches at numerous international conferences and professional communities. He served on the ISSCC International Technical Program Committee for 5 consecutive years since 2005. He is currently serving on the Steering Committee of A-SSCC and also on the Board of Directors for Global Semiconductor Alliance (GSA).

**Pietro Andreani, Lund University, Lund, Sweden**

Are FOMs purely academic constructs? Indeed they are. Do PhD students "design for FOM"? Unfortunately, it happens all too often. But then, are FOMs useful? Yes, of course, if treated with care. For an IC designer, an FOM is a most valuable tool to gauge the discrepancy between ideal and real performance – whether this discrepancy is reasonable is, however, often tricky to assess. FOMs are much less effective when comparing designs with different specifications – nevertheless, they are used massively in this way, too, as they make life so much easier for extremely busy (or lazy) reviewers. Finally, we should strongly discourage FOM tweaking, where minor and possibly irrelevant design features are arbitrarily incorporated in the (new) FOM, with the sole goal of making it appear more competitive.

**Pietro Andreani** received the M.S.E.E. degree from the University of Pisa, Italy, in 1988, and the Ph.D. degree from Lund University, Sweden, in 1999. Between 2001 and 2007 he was chair professor at the Center for Physical Electronics, Technical University of Denmark. From 2005 to 2014 he had a 20% position as analog/RF designer at Ericsson AB in Lund, Sweden. Since 2007, he has been Associate Professor at the dept. of Electrical and Information Technology (EIT), Lund University, working in analog/mixed-mode/RF IC design. He is also the head of the VINNOVA Center for System Design on Silicon, hosted by EIT. He has been a TPC member of ISSCC (2007-2012), is a TPC member of ESSCIRC (chair of the Frequency Generation subcommittee since 2012, TPC chair in 2014) and RFIC, and Associate Editor of JSSC. He has been an IEEE SSCS Distinguished Lecturer since 2017. He has authored numerous papers on harmonic oscillators and phase noise.

## EE5: Lessons Learned – Great Circuits That Didn't Work – (Oops, If Only I Had Known!)



### Organizers:

**Phillip Restle**, IBM T. J. Watson Research Center,  
Yorktown Heights, NY

**Kostas Doris**, NXP, Eindhoven, The Netherlands

**Vivek De**, Intel, Hillsboro, Oregon

**Paul Ferguson**, Analog Devices, Wilmington, MA

Working on your first (or last) IC can be exciting, stressful, rewarding, and embarrassing. Whatever the lesson learned, be assured that it was experienced by pioneers before you. Failures (mistakes or just bad ideas!) can be valuable learning experiences, but are rarely revealed. Tonight, we provide an opportunity for recognized experts to share their past mistakes and failures, and disclose lessons learned. After the panelists have confessed, the audience can also contribute "learning experiences" (in less than a minute). Inevitably, this collection of revelations will be motivating: inspiring to the young and inexperienced; and virtuous for gurus in sharing a universal truth – first-time perfection is rare!

### Abstracts



#### Moderator: Tom Lee, Stanford University, Stanford, CA

Stuff happens. We all know it; many of us have contributed to it (some more than others). The common impulse is to use an industrial-grade Neuralyzer(tm) so that we can all pretend otherwise. But since we arguably learn more from failure than from success, an immoderate approach to moderation will be kept at the ready, to overcome a natural reluctance to admit failure, if needed. Implantable electrodes and pharmaceuticals will be available to handle the extreme cases.

**Thomas H. Lee** is an electrical engineering professor at Stanford University. In 1994 he founded the Stanford Microwave Integrated Circuits Laboratory. He has written and co-authored several books and papers, and recently concluded a tour of duty as the director of DARPA's Microsystems Technology Office.



#### Bram Nauta, University of Twente, Enschede, The Netherlands

If you are a software engineer you can press the compile button many times a day. As an analog IC designer, you may do a "tape out" only a few times per year. Working without direct verification may feel frustrating, but it is also the reason why we still have jobs. Experienced engineers in particular are highly valuable in overseeing complex designs, because their knowledge is built on their past mistakes.

Back then ICs were much simpler, but models and simulators at that time were also more limited. As a result, analog designs kept revealing unexpected results after evaluation of our silicon. Our tools may improve further, but the complexity of our designs will grow accordingly. Learning from mistakes is essential to becoming a good analog designer, but of course it is even better to learn from mistakes made by others, which is the goal of this panel.

In 1987 **Bram Nauta** received the MSc degree in electrical engineering from the University of Twente, Enschede, the Netherlands. In 1991 he received the PhD degree from the same university and joined Philips Research in Eindhoven, the Netherlands. In 1998 he returned to the University of Twente, where he is currently a distinguished professor, heading the IC Design group. He served as the editor-in-chief (from 2007 to 2010) of the IEEE Journal of Solid-State Circuits and was the 2013 program chair of the International Solid-State Circuits Conference (ISSCC). He is currently the president-elect of the IEEE Solid-State Circuits Society.



#### Nicky Lu, Etron Technology, Hsinchu, Taiwan

Engineers like to invent new circuits, and only innovations create product differentiation. From my 40 years in IC design, as a Ph.D. student (Stanford), a researcher/product engineer (IBM), and an entrepreneur fully relying on IC innovations to make startups successful (Etron, GUC, eYs3D), I have learned the hard way that the success of an innovative circuit requires that it be tested through large-volume production and long product lifecycles. Failures are painful, but they are useful and valuable for reducing failures in current and future products. I will share some lessons on circuit innovation and execution from the experience of shipping multiple billion pieces of memory and SOC products, especially where complicated problems involve silicon technologies, packaging, testing, and field usage: sometimes so easy to design but later so hard to fix; so quick to design but later so long to correct; and so easy to give up on innovations that become breakthroughs years later.

As a researcher, design architect, entrepreneur and chief executive, **Nicky Lu** has dedicated his career to the worldwide IC design and semiconductor industry. He is Chairman, CEO and Founder of Etron Technology and co-founded several other high-tech companies including Ardentec and Global Unichip Corp.

Dr. Lu worked for the IBM Research Division and then the Headquarters from 1982 to 1990 and won numerous IBM recognition awards, including an IBM Corporate Award. He co-invented and pioneered a 3D-DRAM technology, known as the Substrate-Plate Trench-Capacitor (SPT) cell, along with its associated array architecture, which has been widely used by IBM and its licensees in 1Mb to 1Gb DRAMs and embedded DRAMs. In 1984, Dr. Lu designed a High Speed CMOS DRAM (HSDRAM) chip, 3x faster than normal DRAMs, whose concept became a core technology of many major DRAMs. He was elected as an AdCom member of the IEEE Solid-State Circuits Society from 1977 to 1999, served on the technical program committee of the IEEE International Solid-State Circuits Conference (ISSCC) from 1988 to 2002, and has served on that of the Symposium on VLSI Circuits since 1990. He is an IEEE Fellow, the recipient of the 1998 IEEE Solid-State Circuits Award, and a member of the U.S. National Academy of Engineering.

As a co-architect leading the 8-inch wafer and DRAM/SRAM/LOGIC technology developments for Taiwan's semiconductor industry in the early 1990s, which later gave rise to many Taiwanese companies that became prominent silicon chip suppliers, Dr. Lu was awarded the Medal of Excellence in Science and Technology by the Premier of the Republic of China. Since 1999 he has pioneered DRAM Known-Good-Die memory products enabling customers' 3D stacked-die system chips. This work heralded the rise of an IC Heterogeneous Integration Era, as described in his plenary talk at the 2004 ISSCC, demonstrating a new 3D IC trend in parallel with Moore's Law.

Dr. Lu received his B.S. in Electrical Engineering from National Taiwan University and M.S. and Ph.D. in EE from Stanford University. He holds over 24 U.S. patents and has published more than 50 technical papers. He serves as Chairman of TSIA (Taiwan Semiconductor Industry Association) and was Chairman of Global Semiconductor Alliance (GSA, the former FSA) from 2009 to 2011. He is an Outstanding Alumnus of National Taiwan University and a Chair Professor and an Outstanding Alumnus of National Chiao Tung University.



**Shanthi Pavan, Indian Institute of Technology, Madras, Chennai, India**

Mea Culpa! The two words you do not want to be saying when you are characterizing your chips on a lab bench. Design tools have become complex, but so have our circuits – and uncaught errors do see silicon. Rather than be tickled, it might be better to be tickled at these (rather expensive) aha moments. At least you and your colleagues will never make the same mistake again! I will present some examples from the dainty world of delta-sigma ...

**Yendluri Shanthi Pavan** (born 1973) is an Indian electrical engineer and a professor at the Department of Electrical Engineering of the Indian Institute of Technology, Madras. He is known for his studies on mixed signal VLSI circuits and is an elected fellow of the Indian National Academy of Engineering. The Council of Scientific and Industrial Research, the apex agency of the Government of India for scientific research, awarded him the Shanti Swarup Bhatnagar Prize for Science and Technology, one of the highest Indian science awards for his contributions to Engineering Sciences in 2012.



**David J. Allstot, University of California, Berkeley, CA**

It took the IC community nearly a century to rediscover the switched-capacitor resistor concept described in 1873 by the great genius, James C. Maxwell. The emergence of MOS technologies, especially CMOS in the early 1980s, combined with the “killer app” of converting switching telephony from analog to digital, fueled an explosion of new circuit techniques, i.e. switched-capacitor (SC) circuits. The SC SAR ADCs and SC filters worked beautifully with the early MOS opamps and comparators, perhaps because they mainly exploited just the basic SC concept.

As the demands for greater speed and accuracy have grown, many clever circuit techniques have evolved that use switches, capacitors, and precision clocks; e.g., offset cancellation, multi-stage clocked high-gain amplifiers and comparators, bootstrapped switch drivers, etc. The use of SC techniques to enhance analog performance led to a new generation of design errors. Some of these errors may revisit us in modern low-voltage designs.

**David J. Allstot** received the B.S. (1969), M.S. (1974), and Ph.D. (1979) degrees from the Univ. of Portland, Oregon State Univ. and the Univ. of California at Berkeley, respectively. He has held several industrial and academic positions. He was a Professor of Electrical Engineering at the University of Washington from 1999 to 2012. In 2000, he was appointed as the Boeing-Egtvedt Chair Professor of Engineering. He served as Acting Chair and Chair of Electrical Engineering from 2004 to 2007. He has advised more than 100 M.S. and Ph.D. graduates. He served as Editor of the IEEE Transactions on Circuits and Systems, General Co-Chair of the 2002 and 2008 IEEE International Symposium on Circuits and Systems, and as the 2009 President of the IEEE Circuits and Systems Society. He is a Fellow of IEEE.



**Barrie Gilbert, Analog Devices NW Labs, Beaverton, OR**

Errare humanum est. In my experience as a promoter and designer of a lot of ICs I've seen mistakes happen. In the early days of monolithic analog design – that is, the mid-1960s – they happened at many levels of the development cycle. Back then, most could be blamed on the general unfamiliarity of designing in this novel medium. There were unwanted interactions between elements via the common substrate (loose electrons and thermals come to mind) and even the mysterious appearance of white light in the microscope's eye...

Some fifty years later, we're still making mistakes, but of a different quality. So much is now regimented that from a procedural viewpoint, the mistakes that young analog designers fall prey to are often due to a lack of intimate knowledge of how to manage the ego of the mischievous transistor. Expect both old and new oversights to be discussed by the panel.

**Barrie Gilbert** is a Life Fellow of the IEEE, ADI Fellow, and a Member of the National Academy of Engineering. Born in 1937, in Bournemouth, England, he pursued an early interest in the then-new “transistor” at Mullard Ltd, later working on first-generation planar ICs. Emigrating to the US in 1964, he joined Tektronix, in Beaverton, OR, where he developed the first electronic “knob-readout” system, and other advances in instrumentation. From 1970 to 1972, back in England, he was Group Leader at Plessey Research Laboratories. In 1972 he worked as an IC designer for Analog Devices Inc. and joined that company full-time in 1979, as their first Fellow. He now directs the development of high-performance analog ICs at the NW Labs in Beaverton, which he founded.

For work on merged logic (later called I<sup>2</sup>L) he received the IEEE “Outstanding Achievement Award” (1970); and in 1986 the IEEE Solid-State Circuits Council “Outstanding Development Award”, citing his earlier invention of the Translinear Technique. He was Oregon Researcher of the Year in 1990, and in 1992 received the Solid-State Circuits Council Award for “Contributions to Nonlinear Signal Processing”. He has received ISSCC “Outstanding Paper” awards five times, the ESSCIRC “Best Paper” award twice, and several industry awards for “Best Product”, etc. He has written extensively about analog design and is a frequent lecturer. He has been issued over 100 patents worldwide and holds an Honorary Doctorate of Engineering from Oregon State University.



**Jon Strange, MediaTek Wireless, West Malling, United Kingdom**

Why we sometimes wish we had peeled the onion a little more. Modern circuit simulators, transistor models, and associated design tools have largely eliminated many of the “schoolboy errors” that were common at the start of my career. These tools, whilst transformative, do not always provide the detailed insights needed for the highest-performance designs. Circuit topologies that at first appear “simple” can reveal very complex behavior, particularly when non-linear effects are considered. Sometimes these insights occur after the silicon is sitting in your lab or, even worse, at your customer's. In this panel we will explore whether the most fertile source of insight might be hindsight. Caveat Emptor.

**Jon Strange** received the BSc and MSc degrees from Durham and Edinburgh Universities respectively. In 1991, he cofounded Mosaic Micro Systems, which was acquired by Analog Devices in 1996. Until 2008 he was Engineering Director at Analog Devices. Currently he is Senior Director and Fellow at MediaTek, responsible for developing cellular transceivers and related technologies for mobile platforms. He has led product development for over a dozen commercially launched RF IC products for the cellular market, with cumulative shipments well in excess of 2B units. He is the recipient of 15 granted patents.

## EE6: Can Artificial Intelligence Replace My Job? The Dawn of a New IC Industry with AI



**Organizers:** **Jaeha Kim**, *Seoul National University, Seoul, Korea*  
**Ki-Tae Park**, *Samsung Electronics, Gyeonggi-do, Korea*

The emergence of artificial intelligence (AI) capable of performing human tasks, and of doing more and doing it better, is fast approaching. Shortly, most businesses, including the IC industry, will choose AI over humans if AI can deliver the same results with lower risks and costs. Consequently, many questions arise for us: what will be the respective roles of AI and humans in developing ICs? How will AI shape the IC industry? What is the right career choice for young people in the field? This panel will showcase diverse experts who will share their vision on this daunting new development in our business.

### Abstracts



**Moderator: Paul D. Franzon, North Carolina State University, Raleigh, NC**

Machine learning is definitely coming to chip design but the question is how much impact will it have? The recent success stories have largely been to use "big data" to produce models to help speed up specific steps. That is valuable but can these big data driven machine learning techniques lead to bigger success stories? That is a central question in this panel. Is there potential to create break-through levels of automation with machine learning or its close cousin machine intelligence? Is there potential to create useful new flows that don't require massive data to drive them? Is there potential to change the overall design paradigm, or, to entirely skip difficult steps such as layout and layout verification? This panel will address these and other issues.

**Paul D. Franzon** is currently a Professor of Electrical and Computer Engineering at North Carolina State University. He earned his Ph.D. from the University of Adelaide, Adelaide, Australia in 1988. He has also worked at AT&T Bell Laboratories, DSTO Australia, Australia Telecom and two companies he cofounded, Communica and LightSpin Technologies. His current interests center on the technology and design of complex systems incorporating VLSI, MEMS, advanced packaging and nano-electronics. He has led several major efforts and published over 200 papers in these areas. In 1993 he received an NSF Young Investigators Award, in 2001 was selected to join the NCSU Academy of Outstanding Teachers, in 2003 was selected as a Distinguished Alumni Professor, and in 2005 won the Alcoa award. He is a Fellow of the IEEE.



**Bill Dally, NVIDIA, Santa Clara, CA**

Machine learning is revolutionizing all aspects of human life and has already achieved super-human capability on tasks ranging from image classification to speech recognition to playing the game of "Go". This capability will be applied to design tasks, automating many tasks and streamlining the design process. AI-enabled EDA tools, like previous EDA tools, will free human designers to work at a higher level - focusing on what systems and chips to design and on high-level organization, rather than detailed design tasks. There will be more jobs developing applications and curating data for training and fewer jobs building low-level hardware. The data to train these EDA tools will come from the design archives of large semiconductor companies. Executives - responsible for profit and loss - will choose to buy these tools - even if they do replace circuit designers, much as they chose to buy logic synthesis tools when they became available. People, not AI, will be the managers - deciding what gets done and setting priorities - while the AI takes care of the low-level details.

**Bill Dally** joined NVIDIA in January 2009 as chief scientist, after spending 12 years at Stanford University, where he was chairman of the computer science department. Dally and his Stanford team developed the system architecture, network architecture, signaling, routing and synchronization technology that is found in most large parallel computers today. Dally was previously at the Massachusetts Institute of Technology from 1986 to 1997, where he and his team built the J-Machine and the M-Machine, experimental parallel computer systems that pioneered the separation of mechanism from programming models and demonstrated very low overhead synchronization and communication mechanisms. From 1983 to 1986, he was at California Institute of Technology (CalTech), where he designed the MOSSIM Simulation Engine and the Torus Routing chip, which pioneered "wormhole" routing and virtual-channel flow control. He is a member of the National Academy of Engineering, a Fellow of the American Academy of Arts & Sciences, a Fellow of the IEEE and the ACM, and has received the ACM Eckert-Mauchly Award, the IEEE Seymour Cray Award, and the ACM Maurice Wilkes award. He has published over 250 papers, holds over 120 issued patents, and is an author of four textbooks. Dally received a bachelor's degree in Electrical Engineering from Virginia Tech, a master's in Electrical Engineering from Stanford University and a Ph.D. in Computer Science from CalTech. He was a cofounder of Velio Communications and Stream Processors.



**Georges G.E. Gielen, Katholieke Universiteit Leuven, Leuven, Belgium**

Despite the tremendous progress in algorithm development (especially optimization techniques) over the years, several analog and mixed-signal IC design tasks are still largely handcrafted under direct human control. This includes initial sizing and layout generation of analog circuits. The main reasons for this likely include the large CPU times needed and the difficulty of explicitly formulating proper constraints to steer the tools towards acceptable results. Techniques from deep machine learning offer the prospect of radically changing this and of having tools automatically discover and learn from the decisions taken by designers. The data set problem can be addressed by applying machine learning at different meta-levels. As such, the tools will take over the routine jobs of designers, who can focus on creative design steps. In addition, by conceiving designs in a more flexible way, deep learning can also be applied at system level after chip fabrication, to fully adapt ICs to their application purpose.

**Georges G.E. Gielen** received the M.Sc. and Ph.D. degrees in Electrical Engineering from the Katholieke Universiteit Leuven (KU Leuven), Belgium, in 1986 and 1990, respectively. He is a full professor at the Department of Electrical Engineering (ESAT). From August 2013 until July 2017, he also served as vice-rector for the Group Science, Engineering and Technology and was responsible for academic HRM at KU Leuven. His research interests are in the design of analog and mixed-signal integrated circuits, and especially in analog and mixed-signal CAD tools and design automation. He is a frequent invited speaker/lecturer and coordinator/partner of several (industrial) research projects in this area, including several European projects. He has (co-)authored 7 books and more than 450 papers in edited books, international journals and conference proceedings. He has been a Fellow of the IEEE since 2002.

**Dario Gil, IBM Research, Yorktown Heights, NY**

AI systems have recently become pervasive and we have already seen the application of AI in augmenting human work and decision making in the transportation, healthcare, and financial services industries. We are witnessing the creation of real technologies that will touch every industry over the next decade. There will continue to be a significant need for expertise in developing intelligent systems and the application of AI to address complex tasks. For the IC industry in particular, I expect the partnership of humans and AI to enable increased design automation and further advances in the manufacturing process. This industry will also be a driving force in advancing new computing hardware optimized for AI. The GPU is only the beginning. Startups, venture capitalists, and corporations are all placing bets on their own approaches to hardware advancements to enable improvements in computing efficiency, which creates a field ripe for expertise and innovation.

**Dario Gil** is a leading technologist and senior executive at IBM. As Vice President of AI and IBM Q, Dr. Gil is responsible for IBM's artificial intelligence research efforts and for IBM's commercial quantum computing program (IBM Q). Prior to his current position Dr. Gil was the VP of Science and Solutions, directing a global organization of 1,500 researchers across 12 laboratories with a broad portfolio of activities spanning the physical sciences, the mathematical sciences, and industry solutions based on AI, IoT, Blockchain and Quantum technologies. His research results have appeared in over 20 international journals and conferences and he is the author of numerous patents. Dr. Gil is an elected member of the IBM Academy of Technology. He received his Ph.D. in Electrical Engineering and Computer Science from MIT.

**Antun Domic, Synopsys, Mountain View, CA**

AI will not "replace my job" in the IC industry during the next decade, but it will change several aspects of it. There is an interesting contrast. The count of new IC designs in a year is below 10,000 so the amount of data is relatively small, even neglecting the data access issue. Going to the other extreme, each of the mask sets contains billions of polygons. This wide range gives us clues regarding places where AI techniques involving big data analysis and machine learning will be successfully applied. In the EDA domain, dominated today by algorithms that include large collections of heuristics, there will be replacements of many specific, but narrow steps by different approaches that will require significant "tuning" using data. As in the past, these improved programs will make it possible for the design community to continue producing increasingly complex chips in the most advanced semiconductor technologies.

**Antun Domic** is Synopsys' Chief Technology Officer. As the company technical spokesperson, he focuses on aligning our advanced silicon roadmaps, driving our performance/low-power differentiation, and optimizing engineering execution across all business units. He previously served as Executive Vice President and General Manager of the Synopsys Design Group, for which he led the development of the company implementation and analog/mixed-signal product lines. Prior to joining Synopsys in 1997, Antun worked at Cadence Design Systems; at the Microprocessor Group of Digital Equipment Corporation in Hudson, Mass.; and at the Massachusetts Institute of Technology (MIT) Lincoln Laboratories in Lexington, Mass. Antun holds a B.S. from the University of Chile in Santiago and a Ph.D. in Mathematics from MIT.

**Seung Hoon Tong, Samsung Electronics, Gyeonggi-do, Korea**

Interest in AI is also increasing in the IC manufacturing business, especially in the areas of "fault detection and diagnosis" based on big data in the backward direction, and "quality prediction for near future and proactive prevention" in the forward direction. The use of AI will thrive on the rapidly growing volume of data in the IC manufacturing process, such as data from sensors, measurement instruments, testers, transaction events, etc. AI machines with high-performance computing power are expected to replace humans doing simple, repetitive tasks, but it will still be challenging for AI to fully replace human roles in the IC manufacturing areas where uncertainty exists. For example, the creative work of designing workflows, or the handling of special events, calls for human judgment. Over the next several years, best practices for AI in IC manufacturing will take shape.

**Seung Hoon Tong** received the B.S. degree from Korea University in 1988, the M.S. degree from the Korea Advanced Institute of Science and Technology (KAIST) in 1991, and the Ph.D. degree from KAIST in 2006, under the sponsorship of Samsung Electronics, all in industrial engineering. In 1991, he joined the semiconductor business of Samsung Electronics (1991~96 Micro, 1997~ Memory Biz), where he has been engaged in the quality, reliability engineering and data mining areas. Recently, his areas of interest are factory-wide quality integration systems and smart manufacturing based on data-driven analytics and engineering statistics.

**Hsien-Hsin Sean Lee, TSMC, Hsinchu, Taiwan**

Machine learning (ML) will become a core technology across the computing stack from data analytics, software, platform, microarchitecture, down to circuits and devices. The key challenge for designers is how to achieve maximum compute efficiency by orchestrating massive amounts of incoming data and computing resources under certain energy budgets. On the other hand, this new computing paradigm can also be applied to the process of creating such intelligent designs themselves. ML can circumvent legacy rule-based design techniques and unleash new opportunities in design optimization and manufacturing quality. For example, ML, with sufficient training design data, could improve place-and-route QoR in physical design, reduce iterative cycles of custom design signoff, ensure early detection of corner cases in physical verification, and provide new insight in yield learning, to name a few. The recent AI/ML resurgence will have a profound impact on design-to-market time and end product quality for the entire electronics eco-system.

**Hsien-Hsin Sean Lee** is a Deputy Director of the Design Methodology & Kit Development Division at TSMC, leading design solution development for physical verification, extraction, reliability, PDK, and custom design. Prior to TSMC, he was a tenured Associate Professor at the School of Electrical and Computer Engineering, Georgia Tech, Atlanta and a senior CPU architect at Intel Corp. Dr. Lee holds a Ph.D. in Computer Science and Engineering from the University of Michigan. He received National Awards including the US NSF CAREER Award and the Department of Energy Early CAREER Award. He has published two book chapters and more than 100 technical articles including four Best Paper Awards. He served as an Associate Editor of IEEE Trans. on CAD, IEEE Trans. on Computers, ACM Trans. on Architecture and Code Optimization, and IEEE MICRO Magazine. He also served as a TPC member for more than 85 international conferences. Dr. Lee is an Industry Advisory Board Member of the IEEE Computer Society. He holds 19 US patents and is a Fellow of the IEEE.

# **SC: Hardware Approaches to Machine Learning and Inference**



**Organizer:** *Daniel Friedman, IBM Thomas J. Watson Research Center, Yorktown Heights, NY*

## **Introduction**

Advances in artificial intelligence are already changing how computing systems interact with users and interact with their environments, with further dramatic changes on the horizon. In this context, machine learning and inference operations have become a critically important computational workload, and the importance of this workload will continue to increase. Today, GPU-, CPU-, and FPGA-based engines dominate the compute landscape for learning and for inference, but the exploration of alternative, enhanced, or complementary compute capability in this problem space is an active and growing research area. In this short course, we will provide a framework for understanding some of the computational challenges in machine learning and inference and discuss emerging technical approaches aimed at meeting those challenges.

The first presentation in the course will provide an overview of machine learning and inference. It will start with a discussion of deep learning and machine learning, then will proceed to describe neural network approaches, model structures and layer types, and will use specific examples to clarify concepts. In the next talk, algorithm and implementation co-optimization for learning and inference will be discussed, including exploration of how applications can be used to drive design choices. In the third presentation, data flow approaches and energy considerations will be discussed in the context of machine learning and inference problems, including the application of the presented design principles to a specific accelerator implementation. Finally, the last presentation will discuss a different but related class of problems, such as continuous learning with limited data, and hardware approaches suited to solving such problems. Broadly, the four presentations will provide machine learning context, associated computation- and application-driven considerations, and a discussion of emerging approaches for machine learning and inference hardware implementation, the latter supported by specific illustrative design examples.

| **Time:**    | **Topic:**                                                                                                                          |
|--------------|-------------------------------------------------------------------------------------------------------------------------------------|
| 8:00 AM      | Breakfast                                                                                                                           |
| 8:20 AM      | Introduction by Chair, *Daniel Friedman, IBM Thomas J. Watson Research Center, Yorktown Heights, NY*                                |
| **8:30 AM**  | **Introduction to Machine Learning and Inference**, *Gu-Yeon Wei, Harvard University, Cambridge, MA*                                |
| 10:00 AM     | Break                                                                                                                               |
| **10:30 AM** | **Algorithm and Implementation Co-Design for Learning and Inference**, *Marian Verhelst, KU Leuven, Heverlee, Belgium*              |
| 12:15 PM     | Lunch                                                                                                                               |
| **1:20 PM**  | **Efficient Edge Solutions for Deep Learning Applications**, *Vivienne Sze, Massachusetts Institute of Technology, Cambridge, MA*   |
| 2:50 PM      | Break                                                                                                                               |
| **3:20 PM**  | **Efficient Alternatives and Extensions to Deep-Learning-Based Solutions**, *Naveen Verma, Princeton University, Princeton, NJ*     |
| 4:50 PM      | Conclusion                                                                                                                          |

## **OUTLINE**



### **8:30 AM SC1: Introduction to Machine Learning and Inference**

*Gu-Yeon Wei, Harvard University, Cambridge, MA*

Machine learning, and in particular deep learning, has received a great deal of attention in recent years as it has disrupted many fields of electrical engineering and computer science. The success of deep-learning techniques comes from their ability to solve notoriously difficult classification and regression problems. Much of deep learning's recent success can be attributed to a virtuous cycle of advances in computing hardware, the availability of huge amounts of labeled data, and the development of deeper models. This talk introduces the broad and dynamic field of deep learning for hardware designers. We begin with a brief history and review key innovations that have led to the powerful deep-learning techniques we see today. We will review the different types of learning widely used today with a focus on neural network models for inference applied across a variety of domains. The primary objective of this talk is to help and motivate chip designers to engage in this exciting opportunity and further push the impact of deep learning via hardware-level innovations.

**Gu-Yeon Wei** is Gordon McKay Professor of Electrical Engineering and Computer Science in the Paulson School of Engineering and Applied Sciences (SEAS) at Harvard University and currently serves as Area Chair for Electrical Engineering. He received his BS, MS, and PhD degrees in Electrical Engineering from Stanford University. His research interests span multiple layers of a computing system: mixed-signal integrated circuits, computer architecture, and design tools for efficient hardware. His research efforts focus on identifying synergistic opportunities across these layers to develop energy-efficient solutions for a broad range of systems from flapping-wing microrobots to machine learning hardware for IoT devices to large-scale servers.




### **10:30 AM SC2: Algorithm and Implementation Co-Design for Learning and Inference**

*Marian Verhelst, KU Leuven, Heverlee, Belgium*

As deep learning comes with significant computational complexity, only relatively recently has this technology become feasible on power-hungry server platforms. In the past few years, we have seen a trend from server-based processing towards embedded processing of the computation for deep learning networks. It is crucial to understand that this evolution is not enabled by either novel architectures or novel deep learning algorithms alone. The breakthroughs clearly come from a close co-optimization between algorithms and implementation architectures. In this presentation, we will review a wide range of recent techniques to a) make the learning algorithms implementation-aware and b) make the hardware implementations algorithm-aware.

**Marian Verhelst** is an assistant professor at MICAS – KU Leuven, Belgium. Her research focuses on embedded machine learning, low-power sensing, and processing for the internet of things. Marian is a member of the Young Academy of Belgium and of the ISSCC and DATE executive committees, and is an associate editor of JSSC.



### **1:20 PM SC3: Efficient Edge Solutions for Deep Learning Applications**

*Vivienne Sze, Massachusetts Institute of Technology, Cambridge, MA*

Visual object detection and recognition are needed for a wide range of applications including robotics/drones, self-driving cars, smart Internet of Things, and portable/wearable electronics. For many of these applications, local embedded processing is preferred due to privacy or latency concerns. This talk will describe methods to enable energy-efficient processing of deep convolutional neural networks (CNN), as such networks form the cornerstone of many deep-learning algorithms. While CNNs deliver record-breaking accuracy for many computer vision tasks, they require significant compute resources due to the size of the networks (e.g., hundreds of megabytes for filter weights storage and 30k-600k operations per input pixel). We will give a short overview of the key concepts in CNNs, discuss the computational challenges CNNs present, particularly in the embedded space, and highlight various opportunities where hardware designers can help to address these challenges.

**Vivienne Sze** is an Associate Professor at MIT in the Electrical Engineering and Computer Science Department. Her research interests include energy-aware signal processing algorithms, and low-power circuit and system design for multimedia applications such as computer vision, autonomous navigation, machine learning and video compression. Prior to joining MIT, she was a Member of Technical Staff in the R&D Center at TI, where she developed algorithms and hardware for the latest video coding standard H.265/HEVC. She is a co-editor of the book entitled, “High Efficiency Video Coding (HEVC): Algorithms and Architectures” (Springer, 2014).

Dr. Sze received the B.A.Sc. degree from the University of Toronto in 2004, and the S.M. and Ph.D. degrees from MIT in 2006 and 2010, respectively. In 2011, she was awarded the Jin-Au Kong Outstanding Doctoral Thesis Prize in electrical engineering at MIT for her thesis on “Parallel Algorithms and Architectures for Low Power Video Decoding”. She is a recipient of the 2017 Qualcomm Faculty Award, 2016 Google Faculty Research Award, 2016 AFOSR Young Investigator Award, 2016 3M Non-tenured Faculty Award, 2014 DARPA Young Faculty Award, and 2007 DAC/ISSCC Student Design Contest Award, and a co-recipient of the 2016 MICRO Top Picks Award and 2008 A-SSCC Outstanding Design Award.

More information about our research in the Energy-Efficient Multimedia Systems group can be found at: <http://www.rle.mit.edu/eems/>



### **3:20 PM SC4: Efficient Alternatives and Extensions to Deep-Learning-Based Solutions**

*Naveen Verma, Princeton University, Princeton, NJ*

Deep-learning systems have had profound impacts in a broad range of applications. However, it is important to remember that these represent only one class of machine learning. In this segment of the short course, we start by probing what the critical attributes of deep learning are, and what challenges in modeling and inference they solve. We then go on to consider the limitations of deep learning in emerging applications involving on-line learning (e.g. reinforcement learning with embedded sensors), namely the need for a large number of training instances and the need for very low energy. This motivates alternatives or extensions to deep learning, which make use of other forms of learning to enhance training and energy efficiency. Given the need for very low energy in many applications, we explore how statistical learning can enable new hardware architectures, substantially overcoming the tradeoffs limiting conventional architectures for sensing and computation. Finally, having examined how algorithmic techniques can enhance systems, we look at how systems, and emerging technologies for sensing, can enhance algorithms. As an illustration, we consider how object-associated sensing, as enabled by IoT devices, has the potential to provide semantic structure, leading to features that can enhance the generalization of learning with simpler and easier-to-train models.

**Naveen Verma** received the B.A.Sc. degree in Electrical and Computer Engineering from the University of British Columbia, Vancouver, Canada in 2003, and the M.S. and Ph.D. degrees in Electrical Engineering from the Massachusetts Institute of Technology in 2005 and 2009 respectively. Since July 2009 he has been with the department of Electrical Engineering at Princeton University, where he is currently an Associate Professor. His research focuses on advanced sensing systems, including low-voltage digital logic and SRAMs, low-noise analog instrumentation and data-conversion, large-area sensing systems based on flexible electronics, and low-energy algorithms for embedded inference, especially for medical applications. Prof. Verma is a Distinguished Lecturer of the IEEE Solid-State Circuits Society, and serves on the technical program committees for ISSCC, VLSI Symp., DATE, and the IEEE Signal-Processing Society (DISPS). Prof. Verma is a recipient or co-recipient of the 2006 DAC/ISSCC Student Design Contest Award, the 2008 ISSCC Jack Kilby Paper Award, the 2012 Alfred Rheinstein Junior Faculty Award, the 2013 NSF CAREER Award, the 2013 Intel Early Career Award, the 2013 Walter C. Johnson Prize for Teaching Excellence, the 2013 VLSI Symp. Best Student Paper Award, the 2014 AFOSR Young Investigator Award, the 2015 Princeton Engineering Council Excellence in Teaching Award, and the 2015 IEEE Trans. CPMT Best Paper Award.

# INDEX TO AUTHORS

## A

A, Rajagopal K. 158  
Abdelfattah, Moataz 438  
Abdo, Ibrahim 168  
Abe, Kenichi 336  
Abe, Mitsuhiro 336  
Abe, Yutaka 82  
Abiri, Behrooz 404  
Acharya, Sunil 94  
Adepu, Prabhu 94  
Afshar, Bagher 66  
Afshari, Ehsan 372  
Agarwal, Pawan 406  
Aggarwal, Vipin 66  
Agger, Elizabeth R. 292  
Agi, Iskender 94  
Agrawal, Abhishek 400  
Ahmed, Mostafa G. 392  
Ahmed, Muneeb 94  
Ahn, Min-Su 204  
Akkaya, Nail Etkin Can 128  
Akkaya, Onur 94  
Albasini, Guido 112  
Ali, Gazi 94  
Ali, Sheikh Nijam 406  
Alioto, Massimo 44  
Alldred, David 162  
Alpman, Erkan 46  
Amin, Sally Safwat 144  
Amravati, Anvesha 124  
An, Jae-Sung 182  
Anand, Tejasvi 268  
Anders, Jens 354  
Ando, Kota 216  
Ando, Yoshinori 484  
Ando, Yuki 442  
Ansary, Maged El 288  
Aoki, Tsuguhide 442  
Aoyama, Satoshi 90  
Aoyama, Hiromitsu 442  
Aparin, V. 70  
Arai, Makoto 442  
Arbabian, Amin 454  
Arvind 42  
Aschieri, Julian 56  
Aseeri, Mohammed 372  
Aseron, Paolo 46  
Asgaran, Saman 106  
Ashburn, Michael 230  
Ashida, Mitsuyuki 442  
Atsumi, Tomoaki 484  
Audoglio, Walter 112  
Augustine, Charles 38  
Azarenkov, Leonid 46

## B

Bachmann, Christian 446  
Badami, Komail 344  
Badar, John 36  
Bae, Hyeon-Min 264, 270  
Bae, Joonsung 282  
Bae, Jooyoung 212  
Bae, Seung-Jun 204  
Baek, Jongbeom 434  
Baek, Kwang-Hyun 182  
Baek, Sanghyun 172  
Baek, Seung Geun 208  
Baeyens, Yves 74  
Bahr, Bichoy 348  
Bal, Steven R. 162  
Balankutty, Ajay 102  
Bamji, Cyrus S. 94  
Ban, Koichiro 442  
Banerjee, Utsav 42  
Bang, Jinbae 340  
Bang, Jooeun 366  
Bang, Sam-Young 204  
Bankman, Daniel 222  
Banna, Srinivasa 348  
Barbieri, Tommaso 426  
Barker, N. Scott 452  
Bassi, Matteo 368, 376  
Bassirian, Pouyan 452  
Baudot, Charles 350  
Baylon, Joe 406  
Bebek, Ozkan 180  
Beck, Noah 40  
Beigné, Edith 304  
Bekele, Ade 378  
Bell, Brian 36  
Belotti, Oscar 112  
Bera, Deep 186  
Bernabé, Stéphane 350  
Berry, Christopher 36  
Berti, Laurent 464  
Bertran, Ramon 300  
Bertulessi, Luca 248, 252  
Besoli, Alfred Grau 66  
Bessire, Bänz 98  
Bevilacqua, Andrea 376  
Bhandari, Saurabh 46  
Bhardwaj, Sachin 158  
Bhartiya, Mukesh 46  
Bhatara, Sumeer 158  
Bhatia, Karan 158  
Bidermann, William 86  
Biswas, Avishek 488  
Blaauw, David 328, 480  
Boers, Michael 66  
Bonizzoni, Edoardo 426  
Bosch, Johan G. 186  
Boser, Bernhard E. 178  
Bostamam, Anas 86  
Botti, Edoardo 426  
Bourke, Donal 286  
Boutafa, Laura 350  
Bowers, Steven M. 452  
Brady, Frederick 86  
Braeken, Dries 464  
Braendli, Matthias 104, 266, 358  
Brambilla, Davide Luigi 426  
Breen, Dan 158  
Brinkmann, Ben 460  
Buckley, William 242  
Buckwalter, James 174  
Bui, QuangDiep 172  
Bushnaq, Sanad 336  
Buyuktosunoglu, Alper 300  
Byeon, Dae-Seok 340  
Byeon, Sang-Yeon 210  
Byun, San-Ho 184

## C

Cacciagrano, Paolo 426  
Cai, Yifeng 148  
Cai, Zeyu 332  
Calhoun, Benton H. 452  
Cameron 64  
Cao, Ying 378  
Carey, Declan 274  
Carey, Sean 36, 300  
Carlen, Peter L. 288  
Carpenter, Thomas M. 188  
Carusone, Anthony Chan 110  
Casey, Ronan 274  
Casper, Bryan 164, 166  
Castany, Olivier 350  
Cathelin, Andreia 372  
Cauwenberghs, Gert 470  
Cavusoglu, Cenk 180  
Cerdeira, Joao P. 346  
Cevrero, Alessandro 104, 266, 358  
Cha, Jin-Youp 210  
Cha, Sanguhn 206  
Chae, Youngcheol 322  
Chakrabarti, Anandaroop 166  
Chakraborty, K. 70  
Chan, Vei-Han 94  
Chan, Wei Liat 66  
Chandrakasan, Anantha P. 42, 488  
Chandrakumar, Hariprasad 232  
Chang, Chih-Hsien 448  
Chang, Chih-Wei 436  
Chang, Chih-Yang 478  
Chang, Chin-Hao 88  
Chang, Duckhyun 84  
Chang, Jonathan 200, 480  
Chang, Ken 108, 274, 378, 390  
Chang, Mau-Chung Frank 278  
Chang, Meng-Fan 482, 494, 496  
Chang, Michael 296  
Chang, Tsung-Yung Jonathan 478  
Chang, Yun-Sheng 494  
Chang, Zu-Yao 186  
Chao, Calvin Yi-Ping 88  
Charbon, Edoardo 96  
Charbonnier, Benoît 350  
Chatterjee, Rohit 158  
Cheah, Ken 336  
Chen, Bo 160  
Chen, Bowen 408  
Chen, Chao 186  
Chen, Feng 310  
Chen, Hsin-Hung 432  
Chen, Huan-Neng 278  
Chen, Jia-Jing 496  
Chen, Ke-Horng 126, 138, 314  
Chen, Leicheng 286  
Chen, Mike Shuo-Wei 254, 362, 394  
Chen, Pai-Yu 496  
Chen, Shao-Qi 126  
Chen, Stanley 390  
Chen, Ting-Sheng 226  
Chen, Wei 34  
Chen, Wei-Chi 478  
Chen, Wei-Chung 436  
Chen, Wei-Hao 494  
Chen, Wei-Hong 66  
Chen, Xi 276  
Chen, Yen-Kai 494  
Chen, Yi-Wen 60  
Chen, Zhao 186  
Cheng, Chiao-Hung 314  
Cheong, Wooseong 338  
Cherniak, Dmytro 248, 252  
Chi, Taiyun 68, 76, 402  
Chiang, Ping-Chuan 274  
Chiang, Yen-Ning 482  
Chidambarrao, Dureseti 36  
Chien, Shih-Hsiung 60  
Chih, Yu-Der 478, 480  
Chinya, Gautham 38  
Chiu, Chao-Chang 436  
Cho, Beob-Rae 204  
Cho, Gun-hee 204  
Cho, Gyu-Hyeong 154, 192, 428, 430  
Cho, Hwasuk 122  
Cho, Jeong-hyun 192  
Cho, Jeong-Hyun 428  
Cho, Jin Hee 208  
Cho, Joo-Hwan 210  
Cho, Junho 108  
Cho, Minki 38  
Cho, Sangyeun 338  
Cho, Seong-Jin 206  
Cho, SeongHwan 474  
Cho, Seung-Hyun 204  
Cho, Sungwee 198  
Cho, Thomas Byunghak 172, 434  
Choi, Hanho 264  
Choi, Hyun-Su 198  
Choi, Jaehyouk 366, 396  
Choi, JaeSeung 198  
Choi, Jeonghyun 172, 434  
Choi, Ji-Su 192  
Choi, Jin-Hyeok 122, 338  
Choi, Jinwon 340  
Choi, Jung-Hwan 204, 206  
Choi, Kwang-Hee 122  
Choi, Minseong 154, 430  
Choi, Seokwoo 208  
Choi, Seouk-Kyu 204  
Choi, Steve 336  
Choi, Sung-Won 206  
Choi, Sungpill 220  
Choi, Sungwon 154, 430  
Choi, Wonchul 84  
Choi, Wonjohn 212  
Choi, Wonjun 206  
Choi, Woojun 322  
Choi, Yoon-Kyung 184  
Choi, Young 206  
Choi, Young Jae 208  
Choi, Youngdon 340  
Choi, Youra 338  
Choi, Michael 184  
Chong, Euhan 110  
Choo, Gyo Soo 340  
Choo, Kangyeop 120  
Choo, Younghwan 434  
Chou, Chung-Cheng 478  
Chou, Po-Sheng 88  
Choudhury, Debabani 166  
Chu, Anh 354  
Chu, Chao 286  
Chu, Jun-Uk 472  
Chu, Kun-Da 170  
Chu, Kyung-Ho 210  
Chu, Li-Cheng 126  
Chu, Yong-Gyu 206  
Chuang, Pierce I-Jen 300  
Chun, Ho Sung 464  
Chun, Junhyun 210, 212, 322  
Chun, Ki Chul 206  
Chung, Jinil 206  
Chung, Ki-Seok 182  
Cichanowski, Mark 36  
Clifford, Michael 56  
Clinton, Michael 200  
Coln, Michael C. W. 242  
Cong, Jason 278  
Cong, Lin 382  
Coombs, Daniel 392  
Cope, Eric 56  
Corey, Rob 460  
Cortese, Alejandro J. 292  
Courellis, Hristos 470  
Cowell, David M. J. 188  
Cuellar, Luis 38

**D**

Dally, William J. 276  
Dandu, Krishnanshu 158  
Daneshgar, Saeid 164, 166  
Dasgupta, Kaushik 164, 166  
Davis, Tim 158  
Dayanik, M. Batuhan 250  
De, Vivek 38, 46  
Degertekin, F. Levent 188  
Dempsey, Dennis 286  
Deng, Wei 246, 444  
Deng, William 180  
Denison, Tim 460  
Depaoli, Emanuele 112  
Diclemente, Dominic 106  
Ding, Junbiao 286  
Ding, Ming 446  
Do, Jeongho 198  
Do, Sung-Geun 204  
Dokania, Rajeev 102  
Dommaraju, Sunny Raj 360  
Dong, Qing 480  
Doo, Su-Yeon 204  
Dorigo, Daniel De 462  
Dorrance, Richard 46  
Douglas, K. 70  
Du, Sijun 152  
Du, Yuan 278  
Duan, Yida 290  
Dubray, Olivier 350  
Dufour, Suzie 288  
Dunworth, J. D. 70  
Dupaix, Brian 438

**E**

Eberhart, Hans 66, 360  
Eberlein, Matthias 318  
Eickhoff, Susan M. 300  
Elgorriaga, Igor 66  
Elkhatib, Tamer 94  
Elkholy, Ahmed 392  
Elmallah, Ahmed 392  
Elshazly, Amr 102  
Elsherbini, Adel 46  
Eminoglu, Burak 178  
Eom, Yoon-Joo 204  
Erba, Simone 112  
Erbagci, Burak 128  
Erdmann, Christophe 378  
Erett, Marc 274  
Esparza, Brando Perez 38  
Ethier, Christian 466  
Ezaki, Takayuki 80

**F**

Fallica, Giorgio 294  
Fan, Jianxun 162  
Fang, Zhongyuan 160  
Farahani, Bahar Jalali 74  
Farley, Brendan 378  
Farzan, Kamran 106  
Fayed, Ayman 438  
Fehrmann, Elizabeth 460  
Fenton, Mike 94  
Firrincieli, Andrea 464  
Flatresse, Philippe 304  
Flory, Robert 46  
Floyd, Michael 300  
Flynn, Michael P. 250  
Foote, Kelly 460  
Francese, Pier Andrea 104, 266, 358  
Frans, Yohan 108, 274, 378, 390  
Freear, Steven 188  
Fu, Yingying 110  
Fujii, Tatsuya 352  
Fujimura, Susumu 336  
Fujimura, Takuya 168  
Fujimura, Yuki 442  
Fujinaka, Hiroshi 82  
Fujisawa, Isao 92  
Fujita, Masahiro 484  
Fukuda, Ryo 336  
Fukushima, Tomonori 92  
Funatsu, Ryohei 90  
Furukawa, Yohei 80  
Furutani, Kazuma 484  
Futami, Shinichiro 86  
Futatsuyama, Takuya 336

**G**

Gagnon-Turcotte, Gabriel 466  
Gai, Weixin 114  
Gampell, Dave 94  
Gandara, Miguel 234  
Gao, Jun 286  
Garcia, Manuel Moreno 98  
Gard, Kevin G. 162  
Gasnier, Pierre 150  
Gasparini, Leonardo 98  
Geary, Kevin 274  
Genov, Roman 288, 296  
Georgakis, Spiros 336  
George, Arup K. 472  
Geshwindman, Guy 66  
Geva, Ofer 36  
Ghosh, Santosh 46  
Ghovanloo, Maysam 188, 468  
Giannini, Vito 158  
Ginsburg, Brian P. 158  
Godbaz, John 94  
Gonano, Giovanni 426  
Gönen, Burak 238  
Gonugondla, Sujan Kumar 490  
Gonzalez-Jimenez, José Luis 350  
Gopal, Srinivasan 406  
Gosselin, Benoit 466  
Goto, Daisuke 442  
Graaf, Ger de 332  
Graf, Hagen 462  
Grandfield, Walter 56  
Gray, C. Thomas 276  
Grézaud, Romain 150  
Grimaldi, Luigi 248, 252  
Grogan, Gaetan Mac 418  
Groppe, David 296  
Grosse, Philippe 350  
Grzyb, Janusz 418  
Gu, Shurong 286  
Guan, Claire 66  
Gunduz, Aysegul 460  
Guo, Qingbo 180  
Guo, Ting 160  
Guo, Zheng 196  
Gupta, Ankit 46  
Gupta, Pankaj 158  
Gürleyük, Çağrı 54  
Gysel, Oliver E. 162

**H**

Ha, Hyunsoo 294  
Ha, Kyung-Soo 206  
Haas, Michael 236  
Hagiwara, Yosuke 442  
Haibi, Hicham 336  
Hajimiri, Ali 190, 404  
Hall, Drew A. 326  
Hamada, Mototsugu 216  
Hamid, Dina 36  
Han, Gong-Heum 204  
Han, Hyun-Ki 192, 428  
Han, Hyunki 430  
Han, Jaeyeol 434  
Han, Kyuwook 338  
Han, Sang-Hyun 182  
Han, Sungmin 472  
Hanumolu, Pavan Kumar 392  
Hanzawa, Katsuhiko 86  
Harada, Mitsu 82  
Hashemi, Hossein 416  
Hashiguchi, Tomoharu 336  
Hashimoto, Toshifumi 336  
Hatsukawa, Kensuke 86  
Hayashi, Yu-ichi 352  
Hayashibara, Ryo 80  
He, Ai 114  
He, Dong 336  
He, Tao 230  
He, Yanbo 348  
He, Yuming 446  
Hedayati, H. 70  
Heinemann, Bernd 418  
Helleputte, Nick Van 294, 464  
Heo, Deukhyoun 406  
Heo, Jin-Seok 206  
Heo, Seungchan 172  
Herron, Jeffrey 460  
Higashi, Yumi 92  
Hillger, Philipp 418  
Hirase, Junji 82  
Hirayama, Teruo 80  
Hirose, Kazutoshi 216  
Hisada, Toshiki 336  
Ho, Chen-Yen 432  
Ho, Cheng-Ru 254, 394  
Ho, Stacy 230  
Holdenried, Chris 106  
Holyoak, Mike 74  
Homayoun, A. 70  
Honda, Katsumi 80  
Hong, Hao-Ping 432  
Hong, Seokyong 84  
Hong, Seong-Kwan 182  
Hong, Sung-Wan 430  
Hoof, Chris Van 294  
Horiguchi, Tomoya 442  
Horiuchi, Kazuhisa 442  
Hoshino, Hiroaki 442  
Hoskote, Yatin 46  
Hosoda, Sohichiro 92  
Hosono, Koji 336  
Hsiao, Chieh-Hsun 432  
Hsieh, Chih-Cheng 240, 494  
Hsieh, Hubert 34  
Hsieh, Sung-En 240  
Hsu, Chung-Lun 326  
Hsu, Kuo-Chun 436  
Hsu, Kuo-Hsiang 494  
Hsu, Tzu-Hsiang 494  
Hu, Boyu 278  
Huang, Chao-Jen 138  
Huang, Chia-Ming 314  
Huang, Chiao-Yi 88  
Huang, Hongye 444  
Huang, Min 34  
Huang, Min-Yu 68  
Huang, Ming-Yu 410  
Huang, Ping-Chen 492  
Huang, Tzu-Chi 436  
Huang, Yimin 88  
Huang, Zhiqiang 260  
Hudner, James 274  
Huh, Yeunhee 154, 428, 430  
Hui, David 300  
Huijsing, Johan H. 50, 324  
Hummerston, Derek 242  
Huott, Bill 36  
Hwang, Kyu-Dong 210  
Hyun, Jinhoon 212  
Hyun, Sangah 212  
Hyun, Seok-Hun 206

**I**

Ibrahim, Brima 66  
Ierssel, Marcus van 106  
Iinuma, Takahiro 86  
Ikeuchi, Katsuyuki 442  
Ikuta, Masaaki 442  
Im, Dain 212  
Im, Jay 390  
Imamoto, Akihiro 336  
Imani, Somayeh 284  
Imoto, Tsutomu 86  
Inoue, Satoshi 336  
Inoue, Yasunori 82  
Iotti, Lorenzo 414, 456  
Isakson, John 36  
Ishii, Hirotomo 92  
Ishii, Yasuhiro 92  
Ishizu, Takahiko 484  
Ito, Rui 442  
Itoh, Tatsuo 278  
Iwagami, Yoichiro 92  
Iwai, Taisuke 168  
Iyer, Bala 38  
Iyer, Sitaraman 34

**J**

Jackson, Bradley 46  
Jacobi, Christian 36  
Jain, Kartik 46  
Jain, Rinkle 38  
Jain, Ritesh 418  
Jain, Sanket 400  
Jain, Saurabh 44  
Jalali, Mohammad Sadegh 106  
Jalan, Saket 158  
Jang, Do-Hun 474  
Jang, Hwajun 340  
Jang, Jaeeun 282  
Jang, Jieun 212, 322  
Jang, Joonsuc 340  
Jang, Seong-Jin 204, 206  
Jang, Soo-Young 210  
Jang, Sunmin 250  
Jaussi, James 164, 166  
Je, Minkyu 472  
Jeerapan, Itthipon 284  
Jeon, Chul-Hee 204  
Jeon, Sejun 264, 270  
Jeon, Younho 264  
Jeong, Chunseok 208  
Jeong, Jaeheon 338  
Jeong, Ji-Yong 182  
Jeong, Junwon 146  
Jeong, Min-Gyu 424  
Jeong, Seokhyeon 328  
Jeoung, Heegeun 84  
Jia, Yaoyao 468  
Jiang, Chen 372  
Jiang, Yang 422  
Jo, Jonghoo 340  
Jo, Youngsin 154, 430  
Joe, Sung-min 340  
John, Naveen 302  
Johnson, Manoj 400  
Jong, Nico de 186  
Joo, Yong-Suk 210  
Joshi, Siddharth 470  
Jou, Chewnpu 278  
Ju, Yongmin 154, 430  
Juan, Kevin 66  
Jun, Jaehoon 330  
Jun, Sung-Wook 90  
Jung, Gwangrok 188  
Jung, Hangyun 206  
Jung, Hyuntaek 198  
Jung, Jonghoon 198  
Jung, Junhee 434  
Jung, Sang-Hoon 204  
Jung, Sangil 84  
Jung, Seungchul 154  
Jung, Taesub 84  
Juvekar, Chiraag 42  
Jyo, Naoki 80

**K**

Kadomoto, Junichiro 216  
Kahrizi, Masoud 66  
Kaihotsu, Takahisa 442  
Kajihara, Hirotsugu 442  
Kamikubo, Yasunobu 80  
Kanagawa, Naoaki 336  
Kanda, Kazushige 336  
Kanehara, Hidenari 82  
Kaneko, Tetsuya 336  
Kaneko, Tohru 444  
Kang, Byung-Hoon 184  
Kang, Dong-Seok 204  
Kang, Dongku 338  
Kang, Gyeong-Gu 192, 428  
Kang, Inyup 434  
Kang, Jian 136  
Kang, Jin-Gyu 424  
Kang, Junho 330  
Kang, Kai 370  
Kang, Kyuchang 206  
Kang, Mingu 490  
Kang, Sanghoon 218  
Kang, Shinwon 164  
Kang, Taewook 328  
Kang, Woongdae 206  
Kano, Nobuo 92  
Kanthapanit, Chanitnan 46  
Karl, Eric 196  
Karmakar, Shoubhik 238  
Karnik, Tanay 46  
Kassiri, Hossein 288  
Katanbaf, Mohamad 170  
Kato, Hidetaka 86  
Kato, Kiyoshi 484  
Kato, Takayuki 442  
Kato, Yukihiro 21  
Kavilipati, Siddartha 56  
Kawabe, Naoyuki 92  
Kawahito, Shoji 90  
Kawai, Seitaro 168  
Kawai, Shusuke 442  
Kawano, Yoichi 168  
Ke, Xugang 386  
Khalil, Waleed 438  
Khan, Wasif 468  
Khellah, Muhammad 38  
Khial, Parham Porsandeh 190  
Khwa, Win-San 496  
Kikuchi, Hidekazu 80  
Kim, Bongjin 264  
Kim, Boram 210  
Kim, Bumsuk 84  
Kim, Byeong-Cheol 204  
Kim, Byungsub 122, 272  
Kim, Changhyeon 218  
Kim, Chris H. 308  
Kim, Chul 470  
Kim, Chulbum 340  
Kim, Chulwoo 146  
Kim, Daehyun 338  
Kim, Daeyeon 196  
Kim, Dongsu 434  
Kim, Euiyeol 84  
Kim, Eun-Ah 206  
Kim, Gyouho 328  
Kim, Ho-joon 340  
Kim, Hoonki 198  
Kim, Hyun-Sik 192  
Kim, Hyung Seok 102  
Kim, Hyung-Jin 206  
Kim, Hyung-Kyu 204  
Kim, Hyunik 120  
Kim, Hyunjin 340  
Kim, Jae-Sung 204  
Kim, Jaehong 338  
Kim, Ji-Hoon 472  
Kim, Jihwan 102, 208  
Kim, Jihyun 120  
Kim, Jinkook 208, 210, 212  
Kim, Jisu 340  
Kim, Ju Eon 182  
Kim, Juhwan 206  
Kim, Jung Soo 182  
Kim, Jung-Wook 206  
Kim, Jungkwan 340  
Kim, Juyeop 366  
Kim, Ki-Duk 192, 428  
Kim, Kwang-Ho 336  
Kim, Kwidong 212  
Kim, Kyeongtae 212  
Kim, Kyung-Ho 206  
Kim, Kyu-Young 210  
Kim, Kyungmin 340  
Kim, Kyungryun 206  
Kim, Mi-Jo 206  
Kim, Minjeong 212  
Kim, Minseo 282  
Kim, Minseok 340  
Kim, Minsik 264  
Kim, Minsu 340  
Kim, Minsung 330  
Kim, Moosung 340  
Kim, Nahyun 340  
Kim, Sang Joon 154  
Kim, Sang-Sun 204  
Kim, Sangyeob 218  
Kim, Seonhong 322  
Kim, Seungbum 340  
Kim, Seunghyun 206  
Kim, Shine 338  
Kim, Shinwoong 122  
Kim, Soohwan 206  
Kim, Stephen 38  
Kim, Suhwan 46, 330  
Kim, Sung 302  
Kim, Sungho 212  
Kim, Tae Kyun 208  
Kim, Tae-Sung 206  
Kim, Tae-Youn 360  
Kim, Taeik 120  
Kim, Taesung 184  
Kim, Wooseok 120  
Kim, Yanghyo 278  
Kim, Yejoong 328  
Kim, Yitae 84  
Kim, Yong-Hun 204  
Kim, Yongho 198  
Kim, YongJun 204  
Kim, Yongwoon 84  
Kim, Young-Jae 206  
Kim, Young-Ju 204, 206  
Kim, Young-Sik 204, 206  
Kimura, Katsuyuki 92  
Kimura, Koji 66  
King, Ya-Chin 494  
Kitajima, Toshiaki 90  
Klotchkov, Ilya 46  
Ko, Hyung-Jong 184  
Ko, Hyungjong 120  
Ko, Jaehyun 272  
Ko, Keunsik 212  
Ko, Min-Woo 192, 428, 430  
Ko, Seungbum 206  
Kobayashi, Hiroyuki 442  
Kobayashi, Naoki 336  
Kocaman, Namik 66  
Kodavati, Venkat 66  
Koh, Insung 212  
Koh, Kwang-Jin 62  
Koh, Seok-Tae 154  
Koh, Yee 336  
Koizumi, Tomohiro 92  
Kojima, Masatsugu 336  
Kondo, Satoshi 92  
Konijnenburg, Mario 294  
Koninck, Yves De 466  
Kono, Fumihiko 336  
Koo, Jahyun 122  
Kordus, Lou 94  
Korpela, Hannu 446  
Kossel, Marcel 104, 266, 358  
Kosugi, Tomohiko 90  
Koyama, Kazushi 442  
Kremen, Vaclav 460  
Krishnamoorthy, A. 262  
Krishnaswamy, Harish 258  
Krivokapic, Zoran 348  
Ku, B.-H. 70  
Kuan, Chien-Wei 432  
Kubota, Hiroshi 92  
Kubota, Kenro 336  
Kuchta, Dan 266  
Kudva, Sudhir S. 276  
Kuhl, Matthias 462  
Kulak, Ross 158  
Kulkarni, Jaydeep 38  
Kull, Lukas 104, 266, 358  
Kumagai, Oichi 86  
Kumagaya, Takeshi 442  
Kumar, Anil 158  
Kundu, Sandipan 102  
Kundu, Somnath 308  
Kuo, Chun-Chieh 138  
Kuo, Hung-Chi 226  
Kuo, Nai-Chung 456  
Kuo, Tai-Haur 60  
Kuo, Ting-Hsun 432  
Kurian, Dileep 46  
Kuroda, Tadahiro 216  
Kurose, Daisuke 92  
Kurose, Kengo 442  
Kuzuya, Naoki 86  
Kwon, Bongjae 198  
Kwon, Dae-Han 210  
Kwon, Daehyun 172  
Kwon, Hye-Jung 204  
Kwon, Hyuk-Jun 204  
Kwon, Ji-Suk 204  
Kwon, Kyeongha 264, 270  
Kwon, Oh-Kyong 182  
Kwon, Sanghyuk 206  
Kwon, Soonwon 264  
Kwon, Woohyun 270  
Kwon, Yongmin 206  
Kyriazidou, Sissy 66  
Kyung, Kye-hyun 340

**L**

LaCaille, Greg 414  
Lacaita, Niccolò 368  
LaCroix, MarcAndre 110  
Lai, Tony 56  
Lai, Yan-Jiun 126  
Lauwereins, Steven 344  
Law, Man-Kay 52, 422  
Lazar, Aurel A. 346  
Leblebici, Yusuf 266  
Lee, Byunghun 468  
Lee, Chan-Yong 204  
Lee, Chang-Kyo 206  
Lee, Chang-Yong 204  
Lee, Cheon An 340  
Lee, Chulseung 338  
Lee, Daniel DG 338  
Lee, Deokwoo 340  
Lee, Dong Uk 208  
Lee, Dongha 212  
Lee, Dongheon 212  
Lee, Donghyeon 132  
Lee, Duckhyung 84  
Lee, Eunryeong 212  
Lee, Gang-Sik 210  
Lee, Geun-Il 210  
Lee, Han-Jun 340  
Lee, Hoi 382  
Lee, Hyun-Bae 210  
Lee, Hyung-Jin 318  
Lee, Hyung-Min 192, 428, 430  
Lee, Jaehun 172  
Lee, Jeong-Woo 204  
Lee, Jesuk 84  
Lee, Jihee 282  
Lee, Jin-Chul 184  
Lee, Jinmook 218  
Lee, Jinsu 220  
Lee, Jiwon 282  
Lee, Jong-Ho 204  
Lee, Jongmin 132  
Lee, Jongmyung 206  
Lee, Jongwoo 434  
Lee, Jung-Ho 184  
Lee, Junghyup 472  
Lee, Junha 206  
Lee, Kangbin 340  
Lee, Kwyro 192  
Lee, Kyo Yun 208  
Lee, Kyoung-Rog 282  
Lee, Kyuho 220  
Lee, Kyung-Hoon 184  
Lee, Minseob 122  
Lee, Minyeong 340  
Lee, Myung-Jae 96  
Lee, Sang-Yong 204  
Lee, Sangho 212  
Lee, Sanghoon 322  
Lee, Sehwan 472  
Lee, Seok-Hee 208, 210, 212  
Lee, SeonGeon 340  
Lee, Seonyong 340  
Lee, Seung-Hun 210  
Lee, Seung-Hwan 182  
Lee, Seungjae 340  
Lee, Seungpil 336  
Lee, Sooeun 272  
Lee, Sungjun 434  
Lee, Sunwoo 292  
Lee, Taeju 472  
Lee, Woo Young 208  
Lee, Yong-Hoon 184  
Lee, Yong-Jae 204  
Lee, Yong-Tae 322  
Lee, Yongmin 132  
Lee, Yongsu 282  
Lee, Yongsun 366, 396  
Lee, Yoonmyung 132  
Lee, YoungSeok 204  
Lee, YunKi 84  
Lee, Yunyoung 212  
Lei, Ka-Meng 52  
Lepin, Florent 350  
Lerdworatawee, J. 70  
Levantino, Salvatore 248, 252  
Li, Chao-Chieh 448  
Li, Chih-Feng 478  
Li, Haitong 492  
Li, Hongxing 242, 286  
Li, Jia-Fang 496  
Li, Kai-Xiang 482, 494  
Li, Pin-Yi 494  
Li, Qiang 306, 496  
Li, Sensen 76, 402  
Li, Shaolan 234  
Li, Tso-Wei 68, 410  
Li, Wen 290, 468  
Li, Xi 302  
Li, Xu 336  
Li, Xuemin 286  
Li, Yi-An 456  
Li, Zhao 162  
Liao, Chia-Chun 448  
Liao, Yuyun 38  
Lim, Chee-Cheow 374  
Lim, Dongju 146  
Lim, Jeong-Don 340  
Lim, Jongyup 480  
Lim, Kyohyun 366  
Lim, Kyunghyun 272  
Lim, Sang-Jin 428  
Lim, Sangjin 192  
Lim, Siok Wei 108  
Lim, Soo-Bin 210  
Lim, Younghyun 366  
Lin, Chi-Hung 360  
Lin, Chorng-Jung 494  
Lin, Huan-Ting 482  
Lin, Jian-He 314  
Lin, Li-Chi 314  
Lin, Longyang 44  
Lin, Shian-Ru 126, 314  
Lin, Shih-Mei 432  
Lin, Wei-Yu 482, 494  
Lin, Winson 108  
Lin, Yen-Ting 138  
Lin, Ying-Hsi 126, 138, 314  
Lin, Yu-Hsin 58  
Lin, Yu-Tso 448  
Lin, Zheng-Jun 478  
Ling, Bill 190  
Lips, Klaus 354  
Liu, Benyuanyi 456  
Liu, Charles Chih-Min 88  
Liu, G. 70  
Liu, Hanli 246, 444  
Liu, Huihua 370  
Liu, Kuo-Chi 138  
Liu, Liang 224  
Liu, Muqing 308  
Liu, Ningxi 452  
Liu, Qing 172  
Liu, Qiyuan 56  
Liu, Ren-Shuo 494  
Liu, Renzhi 46  
Liu, Rui 496  
Liu, Yao-Hong 446  
Liu, Yincai 286  
Liu, Zhe 160  
Lobo, Preetham 300  
Lopez, Carolina Mora 464  
Lou, Liheng 160  
Low, Khim 66  
Lu, Cho-Ying 318  
Lu, D. 70  
Lu, Feng 336  
Lu, Lizhu 286  
Lu, Shen-Fu 138  
Lu, Yan 140, 306  
Lu, Yasu 310  
Lukita, Budi 294  
Luo, Hao 46  
Luong, Howard Cam 260  
Luu, Danny 104, 266, 358  
Lyden, Colin 286  
Lyu, Liangjian 142

**M**

M, Akhila 46  
Ma, D. Brian 386  
Ma, Shaojun 378  
Ma, Xiaofei 306  
Ma, Yu-Sheng 314  
Machado, Ruben 288  
Maddox, Mark 242  
Madi, Fatma 468  
Maeda, Shuhei 484  
Maejima, Hiroshi 336  
Maeng, Junyoung 146  
Mai, Ken 128  
Maiyuran, Subramaniam 38  
Major, Donald 360  
Mak, Pui-In 52, 118, 374, 422, 450  
Maki, Asuka 442  
Maki, Shotaro 168  
Makinwa, Kofi A. A. 50, 54, 238, 320, 322, 324, 332  
Malavasi, Andres 38  
Malgioglio, Frank 36  
Maloberti, Franco 426  
Manglani, Manish J. 162  
Manoli, Yiannos 148, 462  
Mao, Fangyu 140  
Marković, Dejan 232  
Martins, Rui P. 52, 118, 140, 306, 374, 422, 450  
Marx, Maximilian 462  
Mastrangelo, Carlos 180  
Mathai, Deepak 38  
Matsubayashi, Daisuke 484  
Matsuda, Kohei 352  
Matsumoto, Nobu 92  
Matsunaga, Yoshiyuki 82  
Matsuzawa, Akira 168, 246, 444  
Matthew, George 38  
Mavarani, Laven 418  
Mayer, Christopher M. 162  
Mayer, Guenter 36  
Mazzanti, Andrea 112, 368  
Mazzillo, Massimo 294  
Mazzini, Marco 112  
McCarthy, Shaun 94  
McCauley, Rich 94  
McEuen, Paul L. 292  
McLaren, Angus 106  
McLaurin, David J. 162  
McLeod, Scott 274  
Megawer, Karim M. 392  
Meghelli, Mounir 266  
Mehrabani, Alireza 66  
Mehta, Swati 94  
Meinerzhagen, Pascal 38  
Melek, Didem 390  
Mendon, Ashwin 38  
Menezo, Sylvie 350  
Meng, Che-Hao 432  
Menolfi, Christian 104, 266, 358  
Mercer, Timothy 66  
Mercier, Patrick P. 144, 284, 312  
Mhala, Manoj M. 88  
Mialle, Gerald 56  
Miakashi, Makoto 336  
Mikhemar, Mohyee 66  
Miller, Cory 470  
Min, Hao 142, 408  
Min, Youngsun 340  
Minagawa, Hiroe 336  
Minamoto, Takatoshi 336  
Mirabbasi, Shahriar 256  
Mirbozorgi, S. Abdollah 468  
Misoczki, Rafael 46  
Mitomo, Toshiya 442  
Mitra, Subhasish 492  
Miura, Noriyuki 352  
Miura, Tsukasa 80  
Miyake, Yasuo 82  
Miyata, Shinya 80  
Miyata, Tomoki 216  
Moallem, Meysam 158  
Mogallapu, Vishali 94  
Moiseev, Mikhail 46  
Mok, Philip K. T. 310  
Molnar, Alyosha C. 292  
Monaco, Enrico 112  
Monat, P. 70  
Mondal, Susnata 72  
Monfray, Stéphane 150  
Montalvo, Tony 162  
Moody, Jesse 452  
Moon, Chang-Rok 84  
Moon, Inkyu 206  
Moon, Seunghyun 340  
Moons, Bert 222  
Moranz, Christian 462  
Morel, Adrien 150  
Morf, Thomas 104, 266, 358  
Mori, Hiroki 442  
Morita, Makoto 442  
Morozumi, Naohito 336  
Motomura, Masato 216  
Mounaix, Patrick 418  
Mukadam, Mustansir 94  
Mukherjee, Aditya 94  
Muljono, Harry 34  
Murakami, Hirotaka 86  
Murakami, Masashi 82  
Murali, Sriram 158  
Murmann, Boris 222  
Murphy, David 66  
Musha, Junji 336  
Muthukumar, Sriram 46

**N**

Naeem, Naveed 242  
Naffziger, Samuel 40  
Nagano, Takashi 80, 86  
Nagaraja, Satya 94  
Nagase, Masanori 90  
Nagashima, Noriaki 168  
Nagashima, Yoshikazu 92  
Nagata, Makoto 352  
Nagata, Motoki 442  
Nakamizo, Masahiko 80, 86  
Nakamura, Hiroshi 336  
Nakamura, Tetsuya 92  
Nakamura, Tomohiro 90  
Nakanishi, Kensuke 442  
Nakata, Kengo 442  
Nalam, Satyanand 196  
Nam, Sang-Pil 184  
Namkoong, Jin 108  
Namkoong, Jinyung 390  
Narang, Nakul 108  
Narevsky, Nathan 164  
Nasir, Saad Bin 124  
Natarajan, Arun 136, 400  
Nayak, Neeraj 158  
Nayak, Sheetal 94  
Nedovic, Nikola 276  
Neto, Pedro 274  
Neves, Jose 36  
Ngo, Huy Cu 246  
Nguyen, Hao 336  
Nguyen, Huy Thong 402  
Nicoara, Angela 46  
Nigaglioni, Ricardo 36  
Nihei, Ryuichi 442  
Niknejad, Ali M. 414, 456  
Nikoofard, Ali 284  
Nishimura, Kazuko 82  
Nishino, Tatsuki 86  
Nishiuchi, Tomoko 336  
Nitta, Yoshikazu 86  
Niwa, Atsumi 86  
Nomiyama, Takahiro 434  
Nonin, Katsuya 442  
Nonis, Roberto 248  
Noothout, Emile 186

## O

O'Connor, Pat 94  
O'Donnell, John 286  
O'Leary, Gerard 296  
O'Mahony, Frank 102  
O'Neill, Arthur 36  
Ochi, Yusuke 336  
Ogawa, Koji 80  
Ogawa, Masatsugu 336  
Oh, Hwaseok 338  
Oh, Jonghoon 208, 210, 212  
Oh, Sechang 328  
Oh, Seung-Hoon 204  
Oh, Sunghoon 84  
Oh, Youngsun 84  
Ohshita, Satoru 484  
Ohyama, Toshio 86  
Oike, Yusuke 80  
Ojima, Yoshinari 92  
Okada, Kenichi 168, 246, 444  
Okuda, Takashi 484  
Okuni, Hidenori 92  
Onizuka, Kohei 442  
Oota, Yutaka 92  
Opri, Enrico 460  
Orser, Heather 460  
Ortmanns, Maurits 236, 354  
Ota, Yoshiyuki 80  
Ou, Y-C. 70  
Öwall, Viktor 224  
Owczarczyk, Paweł 300

Ozawa, Susumu 336  
Ozkaya, İlter 104, 266, 358

---

**P**

Padmanabhan, Preethi 96  
Padovan, Fabio 376  
Paek, Ji-Seon 434  
Pamula, Venkata Rajesh 302  
Pan, Bo 66  
Pan, Sining 320  
Pancrazio, Stephen 452  
Pang, Jian 168  
Paramesh, Jeyanandh 72  
Paraschou, Milam 40  
Parès, Gabriel 350  
Park, Byungjun 84  
Park, Chang-Byung 184  
Park, Changnam 198  
Park, Donghyuk 84  
Park, Euiyoung 434  
Park, H-C. 70  
Park, Heat Bit 208  
Park, Hong-June 122, 272  
Park, Hyun-Soo 204  
Park, Inho 146  
Park, J. W. 70  
Park, Jae-Koo 204  
Park, Jaechun 338  
Park, Jeongpyo 424  
Park, Jiyoon 340  
Park, Jong Seok 76  
Park, Jonghoon 340  
Park, Joonhong 212  
Park, Keon-Woo 204  
Park, Ki-Tae 338, 340  
Park, Kwang-Il 204, 206  
Park, Kyeong-Bin 182  
Park, Kyung-Bae 204  
Park, Myeong-Jae 208  
Park, Se-Hong 154, 428, 430  
Park, Seunghyun 192  
Park, Shinwoong 62  
Park, Sohyun 340  
Park, Suneui 366  
Park, Sunghyun 198  
Park, Sungsoo 184  
Park, Yongha 340  
Park, Yongin 84  
Park, Yoon-Suk 206  
Park, YounSik 204  
Parkar, Zahir 158  
Parmesan, Luca 98  
Parthasarathy, Harikrishna 158  
Patterson, David 27  
Paul, Somnath 38  
Payak, Keyur 336

Payne, Andrew 94  
Pazhouhandeh, M. Reza 296  
Pedalà, Lorenzo 54  
Peng, Chia-Sheng 432  
Peng, Chung-Ching 38  
Perenzoni, Matteo 98  
Perry, Travis 94  
Pertijs, Michiel A. P. 186, 332  
Perumana, Bevin 66  
Pfeiffer, Ullrich 418  
Pham, Jennifer 106  
Pham, Toan 108  
Philips, Kathleen 446  
Pillonnet, Gaël 150, 304  
Piri, Farshad 368  
Polster, Robert 350  
Poon, Chi Fung 108  
Popov, Roman 46  
Poulton, John W. 276  
Pozzoni, Massimo 112  
Prabhu, Hemanth 224  
Prathapan, Indu 158  
Prather, Larry 94  
Putzeys, Jan 464

---

**Q**

Qian, William 94  
Qiao, Bo 234  
Qu, Guangyang 286  
Qu, Wan Yuan 430  
Quadrelli, Fabio 376  
Quelen, Anthony 150, 304  
Quijano, Eduardo 46  
Qureshi, Rizwan 34

---

**R**

Rabaey, Jan M. 290, 492  
Rabet, Bagher 174  
Raedt, Walter De 294  
Raghunathan, Namas 336  
Rahimi, Abbas 492  
Rahman, Fahim ur 302  
Raj, Mayank 274, 390  
Rajasekaran, Vijay 94  
Rajendra, Srinivas 336  
Ram, Shankar 158  
Ramachandra, Venky 336  
Ramachandran, Ashwin 268  
Ramadass, Yogesh 136  
Raman, Sanjay 62  
Ramasubramanian, Karthik 158  
Ramiah, Harikrishnan 374  
Ramos, Juan-Carlos Pena 344  
Rashid, M. Wasequr 188  
Rathfelder, Pete 56  
Ravichandran, Krishnan 38  
Ravikumar, Surej 318

Raychowdhury, Arijit 124  
Rekhi, Angad Singh 454  
Renaud, Luke 406  
Rentala, Vijay 158  
Restle, Phillip J. 300  
Reumers, Veerle 464  
Reynaert, Patrick 412  
Rhee, Cyuyeol 330  
Rhee, Yeong-Cheol 184  
Rho, Youngsik 340  
Rim, Woojin 198  
Rizzolo, Richard 36, 300  
Roche, Vincent 8  
Roldan, Arianne 108, 274  
Rooijers, Thijs 50  
Rossi, Augusto Andrea 112  
Roussel, Vincent 66  
Roy, Abhishek 452  
Royneogi, Kalapi 34  
Rozenblit, Dmitriy 66  
Rudell, Jacques C. 170  
Ryan, Joseph 38  
Ryu, Yesin 206  
Ryu, YoungHwan 340

---

**S**

Sachdev, Ritu 158  
Sadagopan, Kamala Raghavan 136  
Sadr, Saman 106  
Sai, Akihide 92  
Saigusa, Shigehito 442  
Sajjadi, Ali 66  
Sakai, Shin 80  
Sakaida, Ryota 82  
Sakakibara, Masaki 80  
Sakiyama, Kazuo 352  
Sakuma, Ryoichi 280  
Sakurai, Katsuaki 336  
Salem, Gerard 300  
Salem, Loai G. 312  
Sali, Amruta D. 318  
Salimath, Arunkumar 426  
Salvo, Barbara De 12  
Samala, Sreekanth 158  
Samori, Carlo 248, 252  
Saporito, Anthony 36  
Sarkar, Saikat 66  
Sastry, Manoj 46  
Sathe, Visvesh S. 302  
Satish, Yada 46  
Sato, Jumpei 336  
Sato, Manabu 336  
Sato, Yoshihiro 82  
Satou, Kazuhiko 336  
Satou, Yoshiaki 82  
Satti, Nagmohan 34  
Schlecker, Benedikt 354  
Schubert, Richard P. 162  
Sebastiano, Fabio 54, 238  
Segoria, T. 70  
Seidel, Achim 384  
Sekiya, Masahiro 442  
Sen, Padmanava 66  
Seo, Jaeyoung 272  
Seo, Young-Hun 204  
Seok, Eunyoung 158  
Seok, Mingoo 346  
Seong, Taeho 396  
Seshia, Ashwin A. 152  
Seto, Ichiro 442  
Shahramian, Shahriar 74  
Shanbhag, Naresh 490  
Sharkia, Ahmad 256  
Sharma, Jahnavi 258  
Sheffield, Bryan 200  
Sheikh, Farhana 46  
Sheikhi, Erfan 294  
Shekhar, Sudip 256  
Sheng, Kai 114  
Shi, C.-J. Richard 142  
Shi, Leo 66  
Shi, Linqi 114  
Shi, Yao 328  
Shibata, Kenichi 446  
Shibata, Noboru 336  
Shih, Yi-Chun 480  
Shim, Daeyong 208  
Shim, Minseob 146  
Shim, Seokbo 212  
Shimamoto, Hiroshi 90  
Shimizu, Takahiro 336  
Shimizu, Yutaka 442  
Shimizu, Yuui 336  
Shimizu, Yuuki 336  
Shin, Chang-Ho 204, 206  
Shin, Changsik 430  
Shin, Donghyup 66  
Shin, Dongjin 340  
Shin, Dongjoo 218  
Shin, Dongseok 62  
Shin, Hoon 206  
Shin, Jaewook 390  
Shin, Jung-Bum 204  
Shin, Se-Un 154, 428, 430  
Shin, Sunhye 212  
Shindo, Yoshihiko 336  
Shishido, Sanshiro 82  
Shoji, Natsu 352  
Shouho, Makoto 82  
Shrimali, Arun 158  
Shui, Boyu 462  
Shulaker, Max M. 492

Si, Xin 496  
Sideris, Constantine 190  
Sigal, Leon 36  
Sim, Jae-Yoon 122, 272  
Singh, Amit 74  
Singh, Jasbir 158  
Singh, Rahul 72  
Singh, Rajinder 200  
Smith, Shane 438  
Snow, Dane 94  
Sohn, Young-Hoon 154, 430  
Sohn, Young-Soo 204, 206  
Soltani, Nima 288  
Song, Ki-Whan 338  
Song, Kiwhan 340  
Song, Minyoung 446  
Song, Pingyue 416  
Song, Sanquan 276  
Song, Shuang 294  
Song, Taejoong 198  
Song, William 106  
Song, Yoon-Gue 204  
Song, Yujung 206  
Sonnelitter, Robert 36  
Sowlati, Tirdad 66  
Srinivasan, Anuradha 46  
Srinivasan, Venkatesh 158  
Stanslaski, Scott 460  
Staszewski, Robert Bogdan 448  
Stefanov, André 98  
Steffan, Giovanni 112  
Stoppa, David 98  
Strach, Thomas 300  
Su, Chenxin 170  
Su, Shiyu 362  
Subburaj, Karthik 158  
Sudhakaran, Sunil R. 276  
Suematsu, Yasuhiro 336  
Sugawara, Hiroshi 336  
Sugawara, Takeshi 352  
Sugimoto, Takahiro 336  
Sugimoto, Tomohiro 92  
Sun, Nan 234  
Sun, Sheldon 46  
Sun, Xiaoyu 496  
Sun, Xun 302  
Sun, Zheng 246, 444  
Sung, Da-Wei 432  
Surprise, Jesse 36  
Suy, Hilco 332  
Suzuki, Atsushi 442  
Suzuki, Tomoya 442  
Suzuki, Toshihide 168  
Svelto, Francesco 368  
Swilam, Muhammad 438  
Sylvester, Dennis 328, 480

**T**  
Ta, Tuan Thanh 92  
Tachibana, Ryoichi 442  
Taghavi, Mohammad Hossein 106  
Takagiwa, Teruo 336  
Takahashi, Hirotsugu 80  
Takahashi, Kyosuke 442  
Takahashi, Seiji 88  
Takamaeda-Yamazaki, Shinya 216  
Takeuchi, Tomohiko 442  
Taki, Daisuke 442  
Tam, Simon M. 34  
Tan, Kee Hian 108  
Tan, KeeHian 274  
Tan, Mingliang 186  
Tang, Adrian 278  
Tang, Dexian 246, 444  
Tang, Kai 160  
Tang, Kea-Tiong 494  
Tang, Liangxiao 114  
Tang, Wei 224  
Taniguchi, Kentaro 442  
Tatani, Keiji 80  
Taura, Tadayuki 80  
Teh, Chen-Kong 442  
Tekes, Coskun 188  
Tell, Stephen G. 276  
Temam, O. 214  
Temes, Gabor 230  
Thakkar, Chintan 164, 166  
Thangadurai, Sivaram 124  
Thompson, Barry 94  
Thompson, Michael 288  
Thonnart, Yvain 350  
Tickoo, Omesh 46  
Tiebout, Marc 376  
Tochigi, Yasuhisa 80  
Toda, Anna Papio 66  
Toifl, Thomas 104, 266, 358  
Tokgoz, Korkut K. 168  
Tokunaga, Carlos 38  
Tomekawa, Yuko 82  
Tomioka, Kohei 90  
Tomizawa, Takeshi 442  
Traferro, Stefano 446  
Trexel, Paige 292  
Tsai, Marty 200  
Tsai, Tsung-Yen 126, 314  
Tsapepas, Stelios G. 300  
Tschanz, James 38  
Tseng, Pei-Ling 478  
Tu, Honyih 88  
Turker, Didem 378  
Turner, Walker J. 276

**U**  
U, Seng-Pan 140  
Ueda, Keisuke 446  
Ueyoshi, Kodai 216  
Unekawa, Yasuo 442  
Unruh, Greg 360  
Unternährer, Manuel 98  
Upadhyaya, Parag 108, 378  
Urakawa, Go 442  
**V**  
Vaidya, Vaibhav 38, 46  
Vakilian, Nooshin 66  
Valiante, Taufik A. 296  
Vandergriff, Aaron 56  
Vangal, Sriram 38  
Veldhoven, Robert van 238, 332  
Venes, Ardie 360  
Verbruggen, Bob 378  
Verhelst, Marian 222, 344  
Verma, Naveen 296  
Verweij, Martin D. 186  
Vezyrtzis, Christos 300  
Vigilante, Marco 412  
Vogelmann, Patrick 236  
Vora, Sujal 34  
Vos, Hendrik J. 186  
**W**  
Wada, Takuya 80  
Wakabayashi, Hayato 86  
Wakano, Toshifumi 86  
Waki, Naoya 92  
Wakui, Taichi 336  
Waltener, Guillaume 350  
Wang, Alan 66  
Wang, Eddie 34  
Wang, Fei 68  
Wang, Hanqing 286  
Wang, Hong 38, 46  
Wang, Hua 68, 76, 402, 410  
Wang, Joseph 284  
Wang, Jun 470  
Wang, Li 286  
Wang, Luke 110  
Wang, Shiwei 464  
Wang, Tom 34  
Wang, Weihan 336  
Wang, Wen-Chieh 58  
Wang, Wensong 160  
Wang, Xiaofei 196  
Wang, Xiaoyan 446  
Wang, Xiaoyang 284  
Wang, Yisheng 160  
Wang, Yu 142  
Wang, Zhehong 480  
Warnock, James 36  
Watanabe, Kaori 92  
Watanabe, Katsuyoshi 280  
Watanabe, Takashi 90  
Weaver, Skyler 102  
Webel, Tobias 300  
Weber, Arthur 468  
Wegberg, Roland van 294  
Weijers, Jan-Willem 464  
Weinstein, Dana 348  
Wen, Shi-Jie 308  
Weyer, Daniel 250  
White, Sean 40  
Wicht, Bernhard 384  
Wiedemer, Jami 196  
Wilson, John M. 276  
Wolpert, David 36  
Won, Hyosup 264  
Won, Min-Woo 204  
Wong, H.-S. Philip 492  
Wong, Koon Lun Jackie 360  
Wong, Richard 308  
Woo, Seonghoon 338  
Woo, Young-Jin 428, 430  
Wood, Michael 36  
Woodman, Michael 38  
Worrell, Greg 460  
Wright, Andrew 42  
Wu, An-Yeu 226  
Wu, Rui 444  
Wu, Thomas 88  
Wu, Tony F. 492  
Wurster, Stefan 94

## X

Xi, Sung-Soo 210  
Xiang, Xiao 114  
Xiang, Yingfei 142  
Xie, Cheng-Yu 138  
Xie, Guangxi Ray 360  
Xie, Hongyu 66  
Ximenes, Augusto Ronchini 96  
Xiong, Liang 408  
Xu, Bruce 108  
Xu, Hesong 98  
Xu, Hongtao 408  
Xu, Jiawei 294  
Xu, Long 324  
Xu, Shelley 66  
Xu, Zhanping 94  
Xue, Cheng-Xin 494

## Y

Y.C., Rakesh 158  
Yagi, Seitaro 92  
Yakubo, Yuto 484  
Yamada, Hideki 442  
Yamada, Shuhei 166  
Yamagishi, Toshiyuki 442  
Yamaguchi, Kouichirou 336  
Yamamoto, Satoshi 80  
Yamasaki, Takahiro 90  
Yamashita, Takahiro 336  
Yamashita, Yuichiro 96  
Yamazaki, Shunpei 484  
Yan, Sheng-Hong 432  
Yanagida, Masaaki 82  
Yang, Cheng-Han 494  
Yang, En-Yu 494, 496  
Yang, Fan 310  
Yang, Hui-Kap 206  
Yang, Jaehyeok 264, 270  
Yang, Ling 286  
Yang, Lita 222  
Yang, Minhao 346  
Yang, Phil 66  
Yang, Shang-Hsien 126  
Yang, Shiheng 118, 450  
Yang, Tzu-Hsien 482  
Yang, Wen-Hau 126, 138  
Yang, Yujin 154  
Yashima, Daisuke 442  
Yasue, Toshio 90  
Yasufuku, Tadashi 336  
Yaung, D. N. 96  
Ye, Dawei 142  
Ye, Jae-Hun 182  
Yeh, Che-Hao 138  
Yeh, Chung-Heng 346  
Yeh, Shang-Fu 88  
Yeknami, Ali Fazli 284  
Yi, Haidong 450  
Yim, Dae-Sik 206  
Yin, Jun 118, 374, 450  
Yin, Yun 408  
Yoo, Changsik 424  
Yoo, Hoi-Jun 218, 220, 282  
Yoo, Seyeon 396  
Yoon, Byung Kuk 208  
Yoon, Chanho 338  
Yoon, Heein 366  
Yoon, Hyunchul 206  
Yoon, Insik 124  
Yoon, Jong Shik 198  
Yoon, Jong-Hyeok 270  
Yoon, Jonghyeok 264  
Yoon, Kyu-Seok 192

## Z

Zarghami, Majid 98  
Zhang, Geoff 108  
Zhang, Hongtao 108, 274  
Zhang, Hongyang 112  
Zhang, Jingzhi 370  
Zhang, Joy 66  
Zhang, Peng 446  
Zhang, Shayan 200  
Zhang, Tong 170  
Zhang, Wenfeng 108  
Zhang, Yi 230  
Zhang, Yiqun 480  
Zhang, Zhengya 224  
Zhao, Bo 456  
Zhao, Chenxi 370  
Zhao, Franklin 56  
Zhao, Haibing 274  
Zhao, Hongyuan 274  
Zhao, Li 46  
Zhao, Wenxu 276  
Zhao, Yimiao 286  
Zheng, Yuanjin 160  
Zhou, Lei 390  
Zhou, Yiyin 346  
Zhu, Haiyang 162  
Zhu, Yiting 408  
Zhuang, Ian 390  
Zid, Mounir 350  
Zimmer, Brian 276  
Zimmer, Thomas 418  
Zoellin, Christian 36  
Zou, Chris 38

# EXECUTIVE COMMITTEE



**CONFERENCE CHAIR**  
*Anantha Chandrakasan*  
Massachusetts Institute of Technology  
Cambridge, MA



**DEMONSTRATION SESSION CHAIR**  
*Uming Ko*  
MediaTek  
Austin, TX



**DIRECTOR OF PUBLICATIONS**  
*Laura Fujino*  
University of Toronto  
Toronto, Canada



**EXECUTIVE COMMITTEE SECRETARY, DATA TEAM, SRP CHAIR**  
*SeongHwan Cho*  
KAIST  
Daejeon, Korea



**ADCOM REPRESENTATIVE AND ALUMNI EVENT COORDINATOR**  
*Jan van der Spiegel*  
University of Pennsylvania  
Philadelphia, PA



**PRESS LIAISON AND ARC CHAIR**  
*Kenneth C. Smith*  
University of Toronto  
Toronto, Canada



**PRESS COORDINATOR**  
*Denis Daly*  
Omni Design Technologies



**ITPC FAR EAST REGIONAL CHAIR**  
*Sungdae Choi*  
SK Hynix Semiconductor  
Icheon-si, Korea



**FORUMS CHAIR**  
*Andreia Cathelin*  
STMicroelectronics  
Crolles Cedex, France



**WEB SITE AND A/V CHAIR**  
*Trudy Stetzler*  
Houston, TX



**ITPC FAR EAST REGIONAL VICE-CHAIR**  
*Tai-Cheng Lee*  
National Taiwan University  
Taipei, Taiwan



**EDUCATION CHAIR (SHORT COURSE-TUTORIALS)**  
*Ali Sheikholeslami*  
University of Toronto  
Toronto, Canada



**DIRECTOR OF FINANCE AND BOOK DISPLAY COORDINATOR**  
*Bryant Griffin*  
Penfield, NY



**ITPC EUROPEAN REGIONAL CHAIR**  
*Marian Verhelst*  
KU Leuven  
Heverlee, Belgium



**DIRECTOR OF OPERATIONS**  
*Melissa Widerkehr*  
Widerkehr and Associates  
Montgomery Village, MD



**PROGRAM CHAIR**  
*Alison Burdett*  
Sensium Healthcare  
Abingdon, United Kingdom



**ITPC EUROPEAN REGIONAL VICE CHAIR**  
*Kostas Doris*  
NXP  
Eindhoven, The Netherlands



**PROGRAM VICE-CHAIR**  
*Eugenio Cantatore*  
Eindhoven University of Technology  
Eindhoven, The Netherlands



**ADCOM REPRESENTATIVE**  
*Bryan Ackland*  
Stevens Institute of Tech.  
Hoboken, NJ

## TECHNICAL EDITORS

*Jason H. Anderson*, University of Toronto, Toronto, Canada  
*Leonid Belostotski*, The University of Calgary, Calgary, Canada  
*Dustin Dunwell*, Huawei Technologies, Markham, Canada  
*Vincent Gaudet*, University of Waterloo, Waterloo, Canada  
*Glenn Gulak*, University of Toronto, Toronto, Canada  
*James W. Haslett*, The University of Calgary, Calgary, Canada  
*David Halupka*, Kapik Integration, Toronto, Canada  
*Kenneth C. Smith*, University of Toronto, Toronto, Canada

## MULTI-MEDIA COORDINATOR

*David Halupka*, Kapik Integration, Toronto, Canada

# INTERNATIONAL TECHNICAL PROGRAM COMMITTEE

**PROGRAM CHAIR:** *Alison Burdett*, Sensium Healthcare, Abingdon, United Kingdom

**PROGRAM VICE CHAIR:** *Eugenio Cantatore*, Eindhoven University of Technology, Eindhoven, The Netherlands

## Analog Subcommittee

**Chair:** *Kofi Makinwa*, Delft University of Technology, Delft, The Netherlands

*David Blaauw*, University of Michigan, Ann Arbor, MI

*Youngcheol Chae*, Yonsei University, Seoul, Korea

*Vadim Ivanov*, Texas Instruments, Tucson, AZ

*Mahdi Kashmiri*, Robert Bosch, Palo Alto, CA

*Taeik Kim*, Samsung Electronics, Hwaseong, Korea

*Man-Kay Law*, University of Macau, Taipa, Macau, China

*Yiannos Manoli*, University of Freiburg - IMTEK, Freiburg, Germany

*Tim Piessens*, ICsense, Leuven, Belgium

*Yong Ping Xu*, National University of Singapore, Singapore

*Young-Sub Yuk*, SK Hynix, Icheon-si, Korea

## Data Converters Subcommittee

**Chair:** *Un-Ku Moon*, Oregon State University, Corvallis, OR

*Kostas Doris*, NXP, Eindhoven, The Netherlands

*Paul Ferguson*, Analog Devices, Wilmington, MA

*Michael Flynn*, University of Michigan at Ann Arbor, Ann Arbor, MI

*Pieter Harpe*, Eindhoven University of Technology, Eindhoven, The Netherlands

*Stéphane Le Tual*, STMicroelectronics, Crolles Cedex, France

*Tai-Cheng Lee*, National Taiwan University, Taipei, Taiwan

*Takashi Oshima*, Hitachi, Tokyo, Japan

*Seung-Tak Ryu*, KAIST, Daejeon, Korea

*Yun-Shiang Shu*, Mediatek, Hsinchu City, Taiwan

*Venkatesh Srinivasan*, Texas Instruments, Dallas, TX

*Matt Straayer*, Maxim Integrated Products, North Chelmsford, MA

*Seng-Pan (Ben) U*, University of Macau, Macau

*Bob Verbruggen*, Xilinx, Dublin, Ireland

*Jan Westra*, Broadcom, Bunnik, The Netherlands

## Digital Architectures & Systems Subcommittee

**Chair:** *Byeong-Gyu Nam*, Chungnam National University, Daejeon, Korea

*Thomas Burd*, Advanced Micro Devices, Santa Clara, CA

*Hsie-Chia Chang*, National Chiao Tung University, HsinChu, Taiwan

*Christopher Gonzalez*, IBM, Yorktown Heights, NY

*Wookyeong Jeong*, Samsung, Hwasung-City, Korea

*Muhammad Khellah*, Intel, Hillsboro, OR

*Dejan Markovic*, University of California, Los Angeles, Los Angeles, CA

*Mahesh Mehendale*, TI India, Bangalore, India

*Masato Motomura*, Hokkaido University, Sapporo, Japan

*James Myers*, ARM, Fulbourn, United Kingdom

*Marian Verhelst*, KU Leuven, Heverlee, Belgium

## Digital Circuits Subcommittee

**Chair:** *Edith Beigné*, CEA-LETI, Grenoble, France

*Keith Bowman*, Qualcomm, Raleigh, NC

*Vivek De*, Intel, Hillsboro, OR

*Wim Dehaene*, Kuleuven-MICAS, Leuven, Belgium

*Koji Hirairi*, Sony LSI Design, Atsugi, Japan

*John Maneatis*, True Circuits, Los Altos, CA

*Phillip Restle*, IBM T. J. Watson Research Center, Yorktown Heights, NY

*Youngmin Shin*, Samsung, Hwasung, Korea

*Hirofumi Shinohara*, Waseda University, Fukuoka, Japan

*Dennis Sylvester*, University of Michigan, Ann Arbor, MI

*Ping-Ying Wang*, CMOS-Crystal, Hsinchu, Taiwan

*Kathy Wilcox*, AMD, Boxborough, MA

## IMMD Subcommittee

**Chair:** *Makoto Ikeda*, University of Tokyo, Tokyo, Japan

*Gert Cauwenberghs*, University of California, San Diego, La Jolla, CA

*Calvin Yi-Ping Chao*, TSMC, Hsinchu, Taiwan

*Yoon-Kyung Choi*, Samsung, Hwaseong, Korea

*Peng Cong*, Alphabet, Mountain View, CA

*Jun Deguchi*, Toshiba Memory, Kawasaki, Japan

*Keith Fife*, 4Catalyzer, Guilford, CT

*Michael Kraft*, MICAS, Leuven, Belgium

*Pedram Lajevardi*, Robert Bosch, Palo Alto, CA

*Masayuki Miyamoto*, Wacom, Tokyo, Japan

*Pedram Mohseni*, Case Western Reserve University, Cleveland, OH

*Matteo Perenzoni*, Fondazione Bruno Kessler, Trento, Italy

*Esther Rodriguez-Villegas*, Imperial College London, London, United Kingdom

*Joseph Shor*, Bar Ilan University, Ramat Gan, Israel

*Nick Van Helleputte*, imec, Leuven, Belgium

*Hayato Wakabayashi*, Sony Electronics, San Jose, CA

*Peter Chung-Yu Wu*, National Chiao Tung University, Hsinchu, Taiwan

## Memory Subcommittee

**Chair:** *Leland Chang*, IBM T. J. Watson Research Center, Yorktown Heights, NY

*Seung-Jun Bae*, Samsung, Hwasung, Korea

*Jonathan Chang*, TSMC, Hsinchu, Taiwan

*Meng-Fan Chang*, National Tsing-Hua University, Hsinchu, Taiwan

*Sungdae Choi*, SK Hynix Semiconductor, Icheon, Korea

*Fatih Hamzaoglu*, Intel, Hillsboro, OR

*Takashi Kono*, Renesas, Tokyo, Japan

*Dong Uk Lee*, SK hynix, Icheon, Korea

*Yan Li*, Western Digital, Milpitas, CA

*Ki-Tae Park*, Samsung, Hwasung, Korea

*Chun Shiah*, Etron, Hsinchu, Taiwan

*Shinichiro Shiratake*, Toshiba, Yokohama, Japan

*Wolfgang Spirk*, Micron, Munich, Germany

*Rob Sprinkle*, Google, Mountain View, CA

## Power Management Subcommittee

**Chair:** *Axel Thomsen*, Cirrus Logic, Austin, TX

*Yuan Gao*, IME, A\*STAR, Singapore

*Zhiliang Hong*, Fudan University, Shanghai, China

*Yen Hsun Hsu*, Mediatek, Hsinchu, Taiwan

*Tai-Haur Kuo*, National Cheng Kung University, Tainan, Taiwan

*Hoi Lee*, The University of Texas at Dallas, Richardson, TX

*Gerard Villar Piqué*, NXP Semiconductors, Eindhoven, The Netherlands

*Yogesh K. Ramadas*, Texas Instruments, Santa Clara, CA

*Stefano Stanzione*, imec-NL, Eindhoven, The Netherlands

*Makoto Takamiya*, University of Tokyo, Tokyo, Japan

*Bernhard Wicht*, Leibniz Universitaet Hannover, Hannover, Germany

## RF Subcommittee

**Chair:** *Piet Wambacq*, imec, Heverlee, Belgium

*Andrea Bevilacqua*, University of Padova, Padova, Italy

*Jaehyouk Choi*, Ulsan National Institute of Science Technology, Ulsan, Korea

*Krzysztof Dufrene*, Intel, Linz, Austria

*Minoru Fujishima*, Hiroshima University, Hiroshima, Japan

*Xiang Gao*, Credo Semiconductor, Milpitas, CA

*Brian Ginsburg*, Texas Instruments, Dallas, TX

*Giuseppe Gramegna*, Huawei, Mougins, France

*Payam Heydari*, University of California, Irvine, Irvine, CA

*Chih-Ming Hung*, MediaTek, Taipei, Taiwan

*Abbas Komijani*, Apple, Mountain View, CA

*Harish Krishnaswamy*, Columbia University, New York, NY

*John Long*, University of Waterloo, Waterloo, Canada

*Andrea Mazzanti*, Università di Pavia, Pavia, Italy

*Kohei Onizuka*, Toshiba, Kawasaki, Japan

*Jiayoon Ru*, Broadcom, Irvine, CA

*Hyunchol Shin*, Kwangwoon University, Seoul, Korea

*Hua Wang*, Georgia Institute of Technology, Atlanta, GA

## Technology Directions Subcommittee

**Chair:** *Makoto Nagata*, Kobe University, Kobe, Japan

*Edoardo Charbon*, EPFL & QuTech, Neuchâtel, Switzerland

*Antoine Dupret*, CEA, Gif-sur-Yvette, France

*Hiroshi Fuketa*, AIST, Tsukuba, Japan

*Jan Genoe*, imec, Leuven, Belgium

*Frederic Gianesello*, STMicroelectronics, France

*Kush Gulati*, Omni Design Tech., Milpitas, CA

*Pui-In Mak*, University of Macau, Taipa, Macau

*Patrick Mercier*, University of California, San Diego, La Jolla, CA

*Shahriar Mirabbasi*, University of British Columbia, Vancouver, BC

*Shuichi Nagai*, Panasonic, Moriguchi, Japan

*Sriram Vangal*, Intel, USA

*Ingrid Verbauwhede*, KU Leuven, Leuven, Belgium

*Naveen Verma*, Princeton University, Princeton, NJ

*Long Yan*, Samsung Electronics, Hwaseong, Korea

## Wireless Subcommittee

**Chair:** *Stefano Pellerano*, Intel, Hillsboro, OR

*Pierre Busson*, ST Microelectronics, Crolles, France

*Theodoros Georgantas*, Broadcom, Athens, Greece

*Danielle Griffith*, Texas Instruments, Dallas, TX

*Xin He*, NXP, Eindhoven, The Netherlands

*Chun-Huat Heng*, National University of Singapore, Singapore

*Kyoo Hyun Lim*, FCI, Seongnam-si, Korea

*Yao-Hong Liu*, imec, Eindhoven, The Netherlands

*Howard C. Luong*, Hong Kong University of Science and Technology, Kowloon, Hong Kong

*Hideaki Majima*, Toshiba, Kawasaki, Japan

*David McLaurin*, Analog Devices, Raleigh, NC

*Arun Natarajan*, Oregon State University, Corvallis, OR

*Sudhakar Pamarti*, University of California, Los Angeles, Los Angeles, CA

*Yuu Watanabe*, Waseda University, Atsugi, Japan

*Renaldi Winoto*, Tectus, Saratoga, CA

*Alan Chi-Wai Wong*, EnSilica, Abingdon, United Kingdom

*Ken Yamamoto*, Sony, Atsugi, Japan

## Wireline Subcommittee

**Chair:** *Frank O'Mahony*, Intel, Hillsboro, OR

*Amir Amirkhany*, Samsung Semiconductor, San Jose, CA

*Hyeon-Min Bae*, KAIST, Daejeon, Korea

*Tony Chan Carusone*, University of Toronto, Toronto, Canada

*Simone Erba*, STMicroelectronics, Pavia, Italy

*Azita Emami*, California Institute of Technology, Pasadena, CA

*Yohan Frans*, Xilinx, San Jose, CA

*Pavan Kumar Hanumolu*, University of Illinois, Urbana-Champaign, Urbana, IL

*Andrew Joy*, Cavium, Aliso Viejo, CA

*Jaeha Kim*, Seoul National University, Seoul, Korea

*Mounir Meghelli*, IBM Thomas J Watson Research Center, Yorktown Heights, NY

*Roberto Nonis*, Infineon, Villach, Austria

*Sam Palermo*, Texas A&M University, College Station, TX

*Takayuki Shibasaki*, Fujitsu Laboratories, Kawasaki, Japan

*Bo Zhang*, Broadcom, Irvine, CA

# PROGRAM COMMITTEE

## EUROPEAN REGIONAL SUBCOMMITTEE

### ITPC EUROPEAN REGIONAL CHAIR

*Marian Verhelst*, KU Leuven, Heverlee, Belgium

### ITPC EUROPEAN REGIONAL VICE CHAIR

*Kostas Doris*, NXP, Eindhoven, The Netherlands

### ITPC EUROPEAN REGIONAL SECRETARY

*Yiannos Manoli*, University of Freiburg - IMTEK, Freiburg, Germany

#### MEMBERS

*Edith Beigné*, CEA-LETI, Grenoble, France

*Andrea Bevilacqua*, University of Padova, Padova, Italy

*Pierre Busson*, ST Microelectronics, Crolles, France

*Edoardo Charbon*, EPFL & QuTech, Neuchâtel, Switzerland

*Wim Dehaene*, Kuleuven-MICAS, Leuven, Belgium

*Krzysztof Dufrene*, Intel, Linz, Austria

*Antoine Dupret*, CEA, Gif-sur-Yvette, France

*Simone Erba*, STMicroelectronics, Pavia, Italy

*Jan Genoe*, imec, Leuven, Belgium

*Theodoros Georgantas*, Broadcom, Athens, Greece

*Frederic Gianesello*, STMicroelectronics, France

*Giuseppe Gramegna*, Huawei, Mougins, France

*Pieter Harpe*, Eindhoven University of Technology, Eindhoven, The Netherlands

*Xin He*, NXP, Eindhoven, The Netherlands

*Andrew Joy*, Cavium, Aliso Viejo, CA

*Michael Kraft*, MICAS, Leuven, Belgium

*Stéphane Le Tual*, STMicroelectronics, Crolles, France

*Yao-Hong Liu*, imec, Eindhoven, The Netherlands

*Kofi Makinwa*, Delft University of Technology, Delft, The Netherlands

*Andrea Mazzanti*, Università di Pavia, Pavia, Italy

*James Myers*, ARM, Fulbourn, United Kingdom

*Roberto Nonis*, Infineon, Villach, Austria

*Matteo Perenzoni*, Fondazione Bruno Kessler, Trento, Italy

*Tim Piessens*, ICsense, Leuven, Belgium

*Gerard Villar Piqué*, NXP Semiconductors, Eindhoven, The Netherlands

*Esther Rodriguez-Villegas*, Imperial College London, London, United Kingdom

*Joseph Shor*, Bar Ilan University, Ramat Gan, Israel

*Wolfgang Spirk*, Micron Semiconductor, Munich, Germany

*Stefano Stanzione*, imec-NL, Eindhoven, The Netherlands

*Nick Van Helleputte*, imec, Leuven, Belgium

*Ingrid Verbauwhede*, KU Leuven, Leuven, Belgium

*Bob Verbruggen*, Xilinx, Dublin, Ireland

*Piet Wambacq*, imec, Heverlee, Belgium

*Jan Westra*, Broadcom, Bunnik, The Netherlands

*Bernhard Wicht*, Leibniz Universitaet Hannover, Hannover, Germany

*Alan Chi-Wai Wong*, EnSilica, Abingdon, United Kingdom

## FAR EAST REGIONAL SUBCOMMITTEE

### ITPC FAR EAST REGIONAL CHAIR

*Sungdae Choi*, SK Hynix Semiconductor, Icheon, Korea

### ITPC FAR EAST REGIONAL VICE-CHAIR

*Tai-Cheng Lee*, National Taiwan University, Taipei, Taiwan

### ITPC FAR EAST REGIONAL SECRETARY

*Makoto Takamiya*, University of Tokyo, Tokyo, Japan

#### MEMBERS

*Hyeon-Min Bae*, KAIST, Daejeon, Korea

*Seung-Jun Bae*, Samsung, Hwasung, Korea

*Youngcheol Chae*, Yonsei University, Seoul, Korea

*Hsie-Chia Chang*, National Chiao Tung University, Hsinchu, Taiwan

*Jonathan Chang*, TSMC, Hsinchu, Taiwan

*Meng-Fan Chang*, National Tsing-Hua University, Hsinchu, Taiwan

*Calvin Yi-Ping Chao*, TSMC, Hsinchu, Taiwan

*Jaehyouk Choi*, Ulsan National Institute of Science Technology, Ulsan, Korea

*Yoon-Kyung Choi*, Samsung, Hwaseong, Korea

*Jun Deguchi*, Toshiba Memory, Kawasaki, Japan

*Minoru Fujishima*, Hiroshima University, Hiroshima, Japan

*Hiroshi Fuketa*, AIST, Tsukuba, Japan

*Yuan Gao*, IME, A\*STAR, Singapore

*Chun-Huat Heng*, National University of Singapore, Singapore

*Koji Hirairi*, Sony LSI Design, Atsugi, Japan

*Zhiliang Hong*, Fudan University, Shanghai, China

*Yen Hsun Hsu*, Mediatek, Hsinchu, Taiwan

*Chih-Ming Hung*, MediaTek, Taipei, Taiwan

*Makoto Ikeda*, University of Tokyo, Tokyo, Japan

*Wookyeong Jeong*, Samsung, Hwasung, Korea

*Jaeha Kim*, Seoul National University, Seoul, Korea

*Taeik Kim*, Samsung Electronics, Hwaseong, Korea

*Takashi Kono*, Renesas, Tokyo, Japan

*Tai-Haur Kuo*, National Cheng Kung University, Tainan, Taiwan

*Man-Kay Law*, University of Macau, Taipa, Macau

*Dong Uk Lee*, SK hynix, Icheon, Korea

*Kyoo Hyun Lim*, FCI, Seongnam, Korea

*Howard C. Luong*, Hong Kong University of Science and Technology, Kowloon, Hong Kong

*Hideaki Majima*, Toshiba, Kawasaki, Japan

*Pui-In Mak*, University of Macau, Taipa, Macau

*Mahesh Mehendale*, TI India, Bangalore, India

*Masayuki Miyamoto*, Wacom, Tokyo, Japan

*Masato Motomura*, Hokkaido University, Sapporo, Japan

*Shuichi Nagai*, Panasonic, Moriguchi, Japan

*Makoto Nagata*, Kobe University Graduate School of Science, Kobe, Japan

*Byeong-Gyu Nam*, Chungnam National University, Daejeon, Korea

*Kohei Onizuka*, Toshiba, Kawasaki, Japan

*Takashi Oshima*, Hitachi, Tokyo, Japan

*Ki-Tae Park*, Samsung, Hwasung, Korea

*Seung-Tak Ryu*, KAIST, Daejeon, Korea

*Chun Shiah*, Etron, Hsinchu, Taiwan

*Takayuki Shibasaki*, Fujitsu Laboratories, Kawasaki, Japan

*Hyunchol Shin*, Kwangwoon University, Seoul, Korea

*Youngmin Shin*, Samsung, Hwasung, Korea

*Hirofumi Shinohara*, Waseda University, Kitakyushu, Japan

*Shinichiro Shiratake*, Toshiba, Yokohama, Japan

*Yun-Shiang Shu*, Mediatek, Hsinchu, Taiwan

*Makoto Takamiya*, University of Tokyo, Tokyo, Japan

*Seng-Pan (Ben) U*, University of Macau, Taipa, Macau

*Ping-Ying Wang*, CMOS-Crystal, Hsinchu, Taiwan

*Yuu Watanabe*, Waseda University, Atsugi, Japan

*Peter Chung-Yu Wu*, National Chiao Tung University, Hsinchu, Taiwan

*Yong Ping Xu*, National University of Singapore, Singapore

*Ken Yamamoto*, Sony, Atsugi, Japan

*Long Yan*, Samsung Electronics, Hwaseong, Korea

*Young-Sub Yuk*, SK Hynix, Icheon, Korea

## LOWER B2 LEVEL - YERBA BUENA BALLROOM



## B2 LEVEL - GOLDEN GATE HALL





# ISSCC 2019 Call for Papers



IEEE INTERNATIONAL SOLID-STATE CIRCUITS CONFERENCE

SUNDAY – THURSDAY, FEBRUARY 17 - 21, 2019 • SAN FRANCISCO MARRIOTT MARQUIS HOTEL, SAN FRANCISCO, CA

## ISSCC 2019 CONFERENCE THEME: ENVISIONING THE FUTURE

Innovative and original papers are solicited in subject areas including (but not limited to) the following:

**ANALOG:** Amplifiers, comparators, oscillators, filters, references; nonlinear analog circuits; digitally-assisted analog circuits; sensor interface circuits.

**DATA CONVERTERS:** Nyquist-rate and oversampling A/D and D/A converters.

**DIGITAL ARCHITECTURES & SYSTEMS:** Microprocessors, micro-controllers, applications processors, graphics processors; systems for communications, video and multimedia, machine-learning, deep-learning, neuromorphism, cryptographics, special function acceleration, processing-in-memory, FPGA/reconfigurable systems, system-level power management, near-threshold/subthreshold systems, digital architectures and systems for emerging applications (e.g. virtual reality, autonomous vehicles).

**DIGITAL CIRCUITS:** Building blocks for 2D/3D SoC, including: special-purpose digital circuits, intra-chip communication circuits, clock-distribution techniques, soft-error and variation-tolerant circuits; Circuits for power management in digital applications, including digital/synthesizable voltage regulators and PLLs, digital sensors, adaptive circuits; Subthreshold and Near-threshold circuits; Circuits for neuro-computing; Hardware-security circuits including PUFs, TRNGs, crypto-circuits, side-channel-attack mitigation.

**IMAGERS, MEMS, MEDICAL, & DISPLAY:** Image sensors and companion chips; image-sensor SoCs; MEMS-based integrated systems; ultrasonic sensors, neural interfaces and closed-loop systems; biosensors, microarrays, and lab-on-a-chip; wearable electronics; biomedical SoCs; display and touch electronics, flexible displays, and displays with integrated sensing functionality.

**MEMORY:** Static, dynamic, and non-volatile memories for stand-alone and embedded applications; memory/SSD controllers; high-bandwidth I/O interfaces; memories based on phase-change, magnetic, spin-transfer-torque, ferroelectric, and resistive materials; array architectures and circuits to improve low-voltage operation, power reduction, bit-error management, reliability, and fault tolerance; memory-subsystem enhancements, including in-memory logic functions.

**POWER MANAGEMENT:** Power control and management circuits, regulators; switched-mode power supplies, using inductive, capacitive, and hybrid techniques; energy harvesting circuits and systems; circuits for lighting.

**RF CIRCUITS and WIRELESS SYSTEMS\***: Building blocks and complete solutions at RF, mm-Wave and THz frequencies for receivers, transmitters, frequency synthesizers, transceivers, SoCs and SiPs; Innovative circuit-level and system architecture solutions for established wireless standards and future systems or applications, including wireless sensing, radar and localization.

**TECHNOLOGY DIRECTIONS:** Emerging IC and system solutions for: biomedical applications, sensor interfaces, analog signal processing, power management, computation, data storage, security, and communication; non-silicon, carbon, organic, metal-oxide-, compound, wide-bandgap-semiconductor, and nano electronics circuits; flexible, large-area, stretchable, and printable electronics; 3D integration; spintronics; quantum, optical, new-device, and non-transistor-based circuits.

**WIRELINE:** Receivers/transmitters/transceivers for wireline systems, including backplane transceivers, optical links, chip-to-chip communications, 2.5/3D interconnect, copper cable links, and equalizing on-chip links; exploratory I/O circuits for advancing data rates, power efficiency, and equalization; building blocks for wireline transceivers (such as AGCs, analog and ADC/DAC-based front ends, equalizers, clock generation and distribution circuits including PLLs, line drivers, and hybrids).

*\*Papers submitted to this category will be reviewed by either the RF or Wireless Subcommittee.*

**Submission Deadline is Monday, September 10, 2018 • 3:00PM Eastern Daylight Time (19:00 GMT)**

### STUDENT INITIATIVES

Graduate students are invited to participate in opportunities to showcase ongoing work and exchange experiences with other students and researchers from academia and industry. These include the Student Research Preview and the Silkroad Award (to a first-time student presenting author of a regular paper from an emerging region in the Far East).

**Further information including submission procedures, formats, student initiatives and deadlines can be found at  
<http://www.isscc.org>**

# ISSCC 2018 TIMETABLE

## ISSCC 2018 • SUNDAY FEBRUARY 11<sup>TH</sup>

### Tutorials

|          |                                                          |                                                                                |                                                         |
|----------|----------------------------------------------------------|--------------------------------------------------------------------------------|---------------------------------------------------------|
| 8:30 AM  | <b>T1:</b> Low-Jitter PLLs for Wireless Transceivers     | <b>T2:</b> Nonvolatile Circuits for Memory, Logic, and Artificial Intelligence | <b>T3:</b> Basics of Quantum Computing                  |
| 10:30 AM | <b>T4:</b> Error-Correcting Codes in 5G/NVM Applications | <b>T5:</b> Hybrid Design of Analog-to-Digital Converters                       | <b>T6:</b> Single-Photon Detection in CMOS              |
| 1:30 PM  | <b>T7:</b> Basics of Adaptive and Resilient Circuits     | <b>T8:</b> Fundamentals of Switched-Mode Power Converter Design                |                                                         |
| 3:30 PM  | <b>T9:</b> Digital RF Transmitters                       |                                                                                | <b>T10:</b> ADC-Based Serial Links: Design and Analysis |

### Forums

|         |                                                                    |                                                                            |
|---------|--------------------------------------------------------------------|----------------------------------------------------------------------------|
| 8:00 AM | <b>F1:</b> Intelligent Energy-Efficient Systems at the edge of IoT | <b>F2:</b> FinFETs & FDSOI – A Mixed-Signal Circuit Designer's Perspective |

Events Below in Bold Box are Included with your Conference Registration

### Evening Events

|         |                                                                               |                                                          |
|---------|-------------------------------------------------------------------------------|----------------------------------------------------------|
| 7:30 PM | <b>EE1:</b> Student Research Preview: Short Presentations with Poster Session | 8:00 PM <b>EE2:</b> Workshop on Circuits for Social Good |

## ISSCC 2018 • MONDAY FEBRUARY 12<sup>TH</sup> • PAPER SESSIONS

|                                                                                                                            |                                   |                                        |                                                          |                                    |                                                |
|----------------------------------------------------------------------------------------------------------------------------|-----------------------------------|----------------------------------------|----------------------------------------------------------|------------------------------------|------------------------------------------------|
| 8:30 AM                                                                                                                    | <b>Session 1:</b> Plenary Session |                                        |                                                          |                                    |                                                |
| 1:30 PM                                                                                                                    | <b>Session 2:</b><br>Processors   | <b>Session 3:</b><br>Analog Techniques | <b>Session 4:</b><br>mm-Wave Radios<br>for 5G and Beyond | <b>Session 5:</b><br>Image Sensors | <b>Session 6:</b><br>Ultra-High-Speed Wireline |
| 12noon to 7:00 PM – Book Displays • 5:00 PM to 7:00 PM – Demonstration Session • 5:15 PM – Author Interviews • Social Hour |                                   |                                        |                                                          |                                    |                                                |
| Evening Events                                                                                                             |                                   |                                        |                                                          |                                    |                                                |

|         |                               |                                       |
|---------|-------------------------------|---------------------------------------|
| 8:00 PM | <b>EE3:</b> Industry Showcase | <b>EE4:</b> Figures-of-Merit on Trial |

## ISSCC 2018 • TUESDAY FEBRUARY 13<sup>TH</sup> • PAPER SESSIONS

|                                                                                                                              |                                                                      |                                                       |                                                              |                                                                   |                            |
|------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------|-------------------------------------------------------|--------------------------------------------------------------|-------------------------------------------------------------------|----------------------------|
| 8:30 AM                                                                                                                      | <b>Session 7:</b><br>Neuromorphic, Clocking<br>and Security Circuits | <b>Session 8:</b><br>Wireless Power<br>and Harvesting | <b>Session 9:</b><br>Wireless Transceivers<br>and Techniques | <b>Session 10:</b><br>Sensor Systems                              | <b>Session 11:</b><br>SRAM |
| 1:30 PM                                                                                                                      | <b>Session 13:</b><br>Machine Learning and Signal<br>Processing      | <b>Session 14:</b><br>High-Resolution ADCs            | <b>Session 15:</b><br>RF PLLs                                | <b>Session 16:</b><br>Advanced Optical<br>and Wireline Techniques | <b>Session 12:</b><br>DRAM |
| 10:00 AM to 7:00 PM – Book Displays • 5:00 PM to 7:00 PM – Demonstration Session • 5:15 PM – Author Interviews • Social Hour |                                                                      |                                                       |                                                              |                                                                   |                            |
| Evening Events                                                                                                               |                                                                      |                                                       |                                                              |                                                                   |                            |

|         |                                                                                               |                                                                                                 |
|---------|-----------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------|
| 8:00 PM | <b>EE5:</b> Lessons Learned – Great Circuits That Didn't Work<br>(Oops, If Only I Had Known!) | <b>EE6:</b> Can Artificial Intelligence Replace My Job? – The Dawn of a New IC Industry with AI |

## ISSCC 2018 • WEDNESDAY FEBRUARY 14<sup>TH</sup> • PAPER SESSIONS

|         |                                                                      |                                                  |                                                                 |                                                   |                                                                     |
|---------|----------------------------------------------------------------------|--------------------------------------------------|-----------------------------------------------------------------|---------------------------------------------------|---------------------------------------------------------------------|
| 8:30 AM | <b>Session 18:</b><br>Adaptive Circuits and Digital<br>Regulators    | <b>Session 19:</b><br>Sensors and Interfaces     | <b>Session 20:</b><br>Flash-Memory Solutions                    | <b>Session 22:</b><br>Gigahertz Data Converters   | <b>Session 24:</b><br>GaN Drivers and Converters                    |
| 1:30 PM | <b>Session 26:</b><br>RF Techniques for<br>Communication and Sensing |                                                  | <b>Session 21:</b><br>Extending Silicon and its<br>Applications | <b>Session 23:</b><br>LO Generation               | <b>Session 25:</b><br>Clock Generation for<br>High-Speed Links      |
| 1:30 PM | <b>Session 26:</b><br>RF Techniques for<br>Communication and Sensing | <b>Session 27:</b><br>Power-Converter Techniques | <b>Session 28:</b><br>Wireless Connectivity                     | <b>Session 29:</b><br>Advanced Biomedical Systems | <b>Session 30:</b><br>Emerging Memories                             |
|         |                                                                      |                                                  |                                                                 |                                                   | <b>Session 31:</b><br>Computation in Memory<br>for Machine Learning |

10:00 AM to 3:00 PM – Book Displays • 5:15 PM – Author Interviews

## ISSCC 2018 • THURSDAY FEBRUARY 15<sup>TH</sup>

|         |                                                                                  |                                                                                        |                                                                                     |                                                                                                                |                                                             |
|---------|----------------------------------------------------------------------------------|----------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------|
| 8:00 AM | <b>Short Course:</b><br>Hardware Approaches to<br>Machine Learning and Inference | <b>F3:</b><br>Circuits and Architectures<br>for Wireless Sensing, Radar<br>and Imaging | <b>F4:</b><br>Circuit and System Techniques<br>for mm-wave Multi-Antenna<br>Systems | <b>F5:</b><br>Advanced Optical<br>Communication: From Devices,<br>Circuits and Architectures,<br>to Algorithms | <b>F6:</b><br>Advances in Energy Efficient<br>Analog Design |

