

# Solid-State dToF LiDAR System Using an Eight-Channel Addressable, 20-W/Ch Transmitter, and a $128 \times 128$ SPAD Receiver With SNR-Based Pixel Binning and Resolution Upscaling

Shenglong Zhuo, Member, IEEE, Tao Xia, Lei Zhao, Miao Sun, Yifan Wu, Lei Wang, Hengwei Yu, Jiqing Xu, Jier Wang, Zhihong Lin, Yuan Li, Lei Qiu, Rui Bai, Xuefeng Chen, and Patrick Yin Chiang, Senior Member, IEEE

**Abstract**—An entire solid-state direct time-of-flight (dToF) light detection and ranging (LiDAR) system that incorporates innovations for both the transmitter (TX) and the receiver (RX) is presented in this work. For the illumination TX, we demonstrate solid-state channel addressability, which significantly reduces the transmit power and improves the ranging distance by dividing the field of view (FoV) into separately illuminated sub-regions. In the RX, we introduce single-photon avalanche diode (SPAD) pixel binning, which enables reconfigurability of the sensor's spatial resolution. Finally, we introduce a machine learning (ML) technique that enables this pixel-binned depth sensor to upscale its spatial resolution after training/inference fusion with the intensity image. The laser diode driver (LDD) chip is implemented in a 180-nm bipolar-CMOS-DMOS (BCD) process and is capable of pumping more than 8-A peak current into a multi-junction vertical-cavity surface-emitting laser (VCSEL) array, producing up to 20.3-W optical pulses under a 12.5-V supply voltage. The sensor chip is also implemented in a 180-nm BCD process with a $128 \times 128$ SPAD array and reconfigurable pixel binning. Hardware and software co-optimization under low signal-to-noise ratio (SNR) conditions with ML-based spatial resolution upscaling is demonstrated.

**Index Terms**—3-D sensing, bipolar-CMOS-DMOS (BCD), direct time-of-flight (dToF), laser diode driver (LDD), light detection and ranging (LiDAR), machine learning (ML), multi-channel (MC), single-photon avalanche diodes (SPADs), solid-state, vertical-cavity surface-emitting laser (VCSEL).

## I. INTRODUCTION

Applications requiring 3-D information have been growing rapidly, including secure facial authentication, augmented reality (AR) occlusion, robotic vision and simultaneous localization and mapping (SLAM), autonomous driving, and 3-D reconstruction. Light detection and ranging (LiDAR) systems based on time-of-flight (ToF) measurement using an infrared (IR) light source have been widely adopted for various real-time applications. Compared with other ranging methods, data extraction of ToF is much easier, reducing power consumption and computational time [1], [2], [3].

Manuscript received 7 July 2022; revised 28 September 2022 and 29 October 2022; accepted 29 November 2022. Date of publication 19 December 2022; date of current version 24 February 2023. This article was approved by Associate Editor Farhana Sheikh. This work was supported by the Shanghai 2020 “Science and Technology Innovation Action Plan” under Grant 20501120200. (Corresponding author: Patrick Yin Chiang.)

Shenglong Zhuo, Tao Xia, Lei Zhao, Miao Sun, Hengwei Yu, Jiqing Xu, Jier Wang, Zhihong Lin, Yuan Li, and Patrick Yin Chiang are with the State Key Laboratory of Application-Specific Integrated Circuit (ASIC) and System, School of Microelectronics, Fudan University, Shanghai 201203, China (e-mail: pchiang@fudan.edu.cn).

Yifan Wu and Lei Qiu are with the College of Electronics and Information Engineering, Tongji University, Shanghai 201203, China.

Lei Wang, Rui Bai, and Xuefeng Chen are with PhotonIC Technologies Inc., Shanghai 201203, China.

Color versions of one or more figures in this article are available at <https://doi.org/10.1109/JSSC.2022.3227078>.

Digital Object Identifier 10.1109/JSSC.2022.3227078

Generally, ToF systems can be divided into two categories: indirect ToF (iToF) and direct ToF (dToF). An iToF sensor measures the phase delay of the reflected light instead of the photon round-trip travel time. Although the iToF technique offers high pixel resolution and low depth uncertainty [4], [5], [6], it has several drawbacks that limit its commercialization to short-range applications (<10 m). First, since the phase measured by iToF systems is periodic and bounded to $2\pi$, the distance result is ambiguous whenever the true phase delay exceeds $2\pi$. Although the ranging distance can be extended by using lower modulation frequencies, the depth accuracy is then degraded [7]. Second, iToF cameras suffer from multipath interference (MPI) caused by multiple paths from the illumination source to the same pixel, resulting in significant measurement errors [8]. Third, the whole pixel array of the iToF sensor is modulated at megahertz frequencies through a complex clock distribution network, increasing the power consumption of the sensor.
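The range ambiguity mentioned above follows from the phase relation $\varphi = 4\pi f_{\text{mod}} d / c$, which wraps at $2\pi$. The following is a minimal sketch, assuming an ideal single-frequency continuous-wave sensor; the function names and the example modulation frequencies are illustrative and are not taken from the cited works.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def unambiguous_range_m(f_mod_hz: float) -> float:
    """Largest distance whose round-trip phase delay stays within 2*pi."""
    return C_M_PER_S / (2.0 * f_mod_hz)


def wrapped_distance_m(true_distance_m: float, f_mod_hz: float) -> float:
    """Distance an ideal iToF sensor would report after phase wrapping."""
    return true_distance_m % unambiguous_range_m(f_mod_hz)


# Example modulation frequencies (assumed values, for illustration only).
for f_mhz in (100.0, 20.0, 10.0):
    r = unambiguous_range_m(f_mhz * 1e6)
    print(f"{f_mhz:6.1f} MHz -> unambiguous range {r:.2f} m")
```

Lowering the modulation frequency lengthens the unambiguous range, which is the trade-off against depth accuracy noted in [7].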

The dToF systems send pulsed light to the target with a laser emitter and detect the reflected photons with high-sensitivity detectors such as single-photon avalanche diodes (SPADs). The distance is calculated from the round-trip travel time of the photons measured by time-to-digital converters (TDCs). The dToF sensors can achieve long-range detection (over 100 m), as the theoretical maximum distance is limited only by the optical power [9], [10], [11], [12], [13]. Using background (BG) noise rejection techniques to improve the signal-to-noise ratio (SNR), a detection range of up to 6 km can be achieved [9].

According to the scanning mechanism, LiDAR systems can be classified into five categories: mechanical, micro-electromechanical system (MEMS), flash, optical phased array (OPA), and focal plane array (FPA). Mechanical LiDAR typically comprises multiple laser-detector pairs in the vertical direction, whereas a 360° field of view (FoV) is achieved by mechanical spinning in the horizontal direction [14]. By focusing optical power on a narrow beam, mechanical LiDAR benefits from higher optical power density of the laser pulses, achieving a longer distance than flash LiDAR. However, the mechanical moving components make the system bulky, while posing challenges for optical assembly and long-term reliability. For automotive LiDAR, these problems become even worse since the mechanics used are sensitive to harsh environmental conditions such as vibrations, heat, and cold. Compared with the traditional rotating scanning motors, MEMS-based LiDAR, also called quasi-solid-state LiDAR, has unrivaled advantages in terms of size, speed, and cost, making it ideal for a broader range of applications [10], [15], [16]. All the mechanical components can be integrated into a single chip of micromirrors using semiconductor manufacturing technology. However, MEMS-based LiDAR systems still have moving parts in their scanners, which pose concerns for long-term reliability. Flash LiDAR has the best stability and lowest cost. However, its detection range is relatively short due to the optical power limitation imposed by the current density constraint, heat accumulation, and parasitic loss. OPA-based solid-state LiDAR is rapidly emerging for its fast scanning capability and potentially low cost. The emission angle is altered by changing the phase shift of each nano-photonic antenna on the OPA. However, up to 10000 phase shifters may be needed for automotive applications, resulting in complicated control circuits for the phase shifters and significant computational complexity. FPA-based beam scanning has been realized with electronic scanning at either block level [17] or pixel level [18]. This illumination system resembles the mechanical scanning of the laser beam over the system FoV without any moving components. A similar illumination concept realized with an on-chip thermo-optic switching tree and a focal plane grating-based transmit array was applied to frequency-modulated continuous-wave (FMCW) LiDAR-based 3-D ranging with promising results [19].

One major drawback of SPAD dToF sensors is that they typically have orders of magnitude less spatial resolution than RGB sensors. Limited by the SPAD detector and in-pixel circuits, the pixel size is hard to scale down, resulting in a low fill factor and poor pixel density. In [20], a 25- $\mu\text{m}$  pitch, 340  $\times$  96 pixel array with 70% fill factor in the 180-nm CMOS process was realized. In [21], a 1200  $\times$  900 SPAD array with 6- $\mu\text{m}$  pixel pitch in the 65-nm CMOS technology was implemented. To further reduce the pixel size and improve the fill factor with more circuits such as pixel-level TDC and histogramming, 3-D stacked backside-illuminated (BSI) CMOS technology was used in [22], [23], [24], and [25].

To overcome the limited spatial resolution of the LiDAR system, several guided depth super-resolution (SR) imaging approaches have been reported [26], [27]. Among various methods, the machine learning (ML)-based approaches have shown good performance. In [28], the U-Net-based DepthSR-Net algorithm was developed to reconstruct a high-resolution (HR) depth map from its low-resolution (LR) version with state-of-the-art performance. This U-Net architecture based on convolutional neural networks (CNNs) requires very little training data but achieves good results in image denoising and SR tasks. Ruget et al. [29] proposed an improved neural network that makes use of the different features extracted from the raw histogram to provide significant improvements in image quality and resolution over a wide range of noise scenarios.

Fig. 1. Working principle and implementation of the MC dToF LiDAR system with hardware and software co-optimization.

Most state-of-the-art LiDAR systems mainly focus on the sensor design [10], [21], [22], [23]. However, the optical-electrical system of LiDAR is complex, requiring hardware and software co-optimization across the entire signal chain: high-power sub-1-ns pulsed laser drivers, high-efficiency lasers, laser safety, optical lens for focusing or diffusion, high-SNR single-photon detection receiver (RX) arrays, and ML-based computational photography. In this article, we demonstrate an entire 3-D TX–RX system depicted in Fig. 1 that incorporates innovations in each of these domains. For the illumination transmitter (TX), we demonstrate solid-state channel addressability, which significantly reduces the transmit power and improves the ranging distance by dividing the FoV into separately illuminated sub-regions. In the RX, we introduce SPAD pixel binning, which enables reconfigurability of the sensor's spatial resolution based on the measured SNR. Finally, we introduce an ML technique that enables this pixel-binned depth sensor to upscale its spatial resolution after training/inference fusion with the intensity image.

The remainder of this article is organized as follows. Section II presents the overall system architecture and the TX illumination system. Section III describes the architecture and operation of the proposed SPAD sensor RX with the ML-based resolution upscaling, and Section IV presents the experimental results. Finally, Section V concludes this article.

## II. SYSTEM ARCHITECTURE AND TX ILLUMINATION

### A. Solid-State Scanning LiDAR Architecture

Fig. 2. Principles of solid-state scanning LiDAR system. (a) Comparison of optical pulses received by each pixel between flash and $N$-channel solid-state scanning LiDAR. (b) Optical design and ray tracing simulation result of the TX. (c) System timing diagram.

Fig. 1 shows the working principle and implementation of the multi-channel (MC) dToF LiDAR system. The individual TX channels are digitally controlled to allow different configurations, including in particular a flash LiDAR mode when all the channels operate synchronously. Compared with the mechanical scanning LiDAR [10], [21], this solid-state TX exhibits improved reliability because all the moving parts are eliminated. Compared with a conventional flash LiDAR [23], this approach focuses each laser subsegment's energy on a smaller FoV, extending the ranging distance with the same TX power, or reducing the TX power for the same distance.

Fig. 3. Cross section and block diagram of the proposed MC TX.

Fig. 4. Micrograph of the LDD ASIC die.

A comparison of the optical pulse waveform received by each pixel between flash and $N$-channel solid-state scanning LiDAR is shown in Fig. 2(a). In a flash LiDAR system, $M$ pulses with an optical power of $P_f$ are received by each pixel. By concentrating more energy into one segment, the peak power of optical pulses received by each pixel in an $N$-channel scanning LiDAR is increased to $N \times P_f$ and the number of pulses is reduced to $M/N$ for the same average power. The signal-to-BG-noise ratio (SN<sub>BGR</sub>) can be given as the ratio between the average number of signal detections and the square root of the number of random noise detections during the laser pulse envelope. The SN<sub>BGR</sub> improvement is proportional to the square root of the number of channels [30]. Also, the requirement of readout bandwidth can be alleviated with resource sharing between different channels.

Fig. 2(b) shows the optical design and ray tracing simulation result of the TX. A cylindrical mirror with $f=25$ mm on the $x$-axis and $f=\infty$ on the $y$-axis is adopted in the TX to collimate the circular Gaussian spot into a rod-like MC array. Another IR optical lens with $f=8$ mm, $940 \pm 10$ nm transmission, is used in the RX to cover the whole projected light array and reduce aberrations and distortions.

Fig. 2(c) shows the timing diagram of the system. Both the synchronous timing and the spatial correlation between the sensor and the driver are controlled by the synchronization signals FRAME\_ON, CH\_SEL, and low-voltage differential signaling (LVDS). The 1-s frame period of FRAME\_ON is divided into eight sub-frames. During the emission phase, periodic laser pulses are generated by the vertical-cavity surface-emitting laser (VCSEL) array at a repetition rate of 5 MHz, an optical peak power of 20.3 W, and an adjustable pulsewidth of 1.5 ns. The timestamps of the reflected photons are generated by the TDCs and read out through the digital video port (DVP) interface to the field programmable gate array (FPGA). The depth information is extracted during the data processing phase and transmitted to the personal computer (PC) for post-processing.
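The $\sqrt{N}$ argument for Fig. 2(a) can be checked with a toy photon budget. The sketch below assumes that signal detections scale linearly with received optical energy and that BG detections scale with the number of pulse windows; the function names and arbitrary units are illustrative. It also works out the duty cycle and average optical power implied by the quoted 5-MHz, 1.5-ns, 20.3-W pulses.

```python
import math


def snbgr_gain(n_channels: int, pulses_flash: float,
               p_flash: float = 1.0, bg_per_pulse: float = 1.0) -> float:
    """SN_BGR of N-channel scanning relative to flash (arbitrary units).

    Assumptions: signal counts ~ pulses * peak power, BG counts ~ pulses.
    """
    pulses_scan = pulses_flash / n_channels          # M / N pulses per pixel
    signal_flash = pulses_flash * p_flash            # M * P_f
    signal_scan = pulses_scan * n_channels * p_flash  # (M / N) * (N * P_f)
    noise_flash = pulses_flash * bg_per_pulse
    noise_scan = pulses_scan * bg_per_pulse
    snbgr_flash = signal_flash / math.sqrt(noise_flash)
    snbgr_scan = signal_scan / math.sqrt(noise_scan)
    return snbgr_scan / snbgr_flash


def average_power_w(peak_w: float, width_s: float, rate_hz: float) -> float:
    """Average optical power of a rectangular pulse train while it is emitting."""
    return peak_w * width_s * rate_hz


print(f"SN_BGR gain for 8 channels: {snbgr_gain(8, 8000):.3f}")
print(f"Duty cycle: {1.5e-9 * 5e6:.4%}")
print(f"Average power while pulsing: {average_power_w(20.3, 1.5e-9, 5e6):.4f} W")
```

The signal term is unchanged because the same energy reaches each pixel; only the number of pulse windows that collect BG photons shrinks by $N$.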

### B. Solid-State Scanning TX Architecture

Fig. 3 shows the cross section and block diagram of the proposed solid-state scanning TX. It includes a monolithic eight-channel triple-junction (3J) VCSEL array, a sub-1-ns MC laser diode driver (LDD) application-specific integrated circuit (ASIC), a photodiode (PD), and the cylindrical mirror of Fig. 2(b) placed above them. The PD senses the backscattered laser pulse and converts it into a current signal. $L_{VDD}$, $L_{GND}$, $L_{CC}$, and $L_{A1}$ to $L_{AN}$ in Fig. 3 are parasitic inductances. As the micrograph in Fig. 4 shows, to minimize all the parasitic inductances, the VCSEL is mounted on top of the LDD die with both sides of every channel wire-bonded to the LDD integrated circuit (IC). The back side of the VCSEL is electrically connected to a ground plane on the LDD's redistribution layer (RDL), which also acts as a heat sink to prevent over-temperature of the VCSEL.

The LDD ASIC is designed in a 180-nm 5-V gate bipolar-CMOS-DMOS (BCD) process, as it provides both fast 1.8-V core devices and laterally diffused metal-oxide-semiconductor (LDMOS) with over 16-V operation V<sub>ds</sub> and V<sub>dg</sub>. The principle of laser driving in this work is to use LD-PMOS switches to pull up the anode of the VCSEL channel to pump the current from the VCSEL supply (LDVCC) into the VCSEL. LDVCC is generated by an OFF-chip dc-dc boost converter with its feedback network controlled by the LDD ASIC. The LDD receives pulse trigger signals in LVDS format and generates programmable ns-scale pulses ON-chip. An ON-chip optical oscilloscope (OCOO) [31] receives the current signal from the sensing PD and records the pulse waveform. With the help of this OCOO, automatic peak power control (APPC) and power-related laser safety are enabled.

### C. LDD Topology

The first challenge in designing the TX is to achieve high power and high speed (sub-1-ns pulses) at the same time. To address this challenge, first, a 3J VCSEL array is used in this TX to enhance the optical power conversion efficiency (PCE). When a 9-V forward voltage is imposed, 8.5-A current flows through each VCSEL channel to generate 20-W optical power. Second, the gates of LD-PMOSs in the LDD are driven by 5-V full-swing signals with rising/falling edges around 100–200 ps. This switching mode operation exploits the largest current density on the LDMOSs. The fast switching driver eliminates VCSEL bias current because, in low duty-cycle dToF applications, even a small bias current can waste much power. Without current sources, the current amplitude on the VCSEL is controlled by LDVCC, which is eventually controlled by the APPC loop in the LDD ASIC. Third, all the parasitic inductances in the high-current path are minimized. Thanks to the 3-D stack structure, the wiring inductance from the VCSEL cathode to the LDD ground plane ( $L_{CC}$ ) is reduced to nearly zero, whereas the inductance between LDD outputs and VCSEL channel anodes ( $L_{A1}$ – $L_{AN}$ ) is also minimized. As Fig. 4 shows, 16 LDVCC and GND pads for wire bonding are placed on each of the two short edges of the LDD chip to minimize  $L_{VDD}$  and  $L_{GND}$ .

Another difficulty in implementing a solid-state MC TX is that all mature MC VCSEL arrays are common-cathode and split-anode, as the VCSEL emitters on a monolithic MC array intrinsically share the same cathode connection. Legacy single-channel LDDs [32], [33], [34], [35] usually exploit the high carrier mobility of NMOS to drive the cathode of the laser diode. This low-side driver (LS-LDD) can be combined with an anode switch array (Fig. 5) to drive the common-cathode multi-zone VCSEL. However, the approach of the LDD in this work is to use separate PMOSs to drive the split anodes of the VCSEL channels while connecting the cathode of the array to ground (Fig. 3). The drawbacks of high-side (HS) driving are: 1) the PMOS has lower carrier mobility than NMOS and 2) the pre-driver draws current from LDVCC, thus consuming considerable electrical power. The drawbacks of LS driving are: 1) capacitors must be added to the anode of each VCSEL channel to provide a low ac impedance both when the anode switch is ON and when it is OFF and 2) the driver needs to drive all the junction and parasitic capacitors multiplied by the number of channels.

Fig. 5. Legacy LS LDD with HS switches.

Fig. 6. Performance comparison of HS-LDD and LS-LDD with different numbers of channels. (a) Simulated laser current pulse. (b) Simulated power consumption for 7.9-A, 900-ps, 10-MHz current pulses.

As Fig. 6 shows, MC HS-LDDs and LS-LDDs are compared in simulation with proper parasitic capacitance and inductance included. The size of the driving NMOS in LS-LDDs is proportional to the number of channels ( $N_c$ ). The performance of HS-LDD does not change with  $N_c$  because all the channels are independent and identical. As Fig. 6(a) shows, the falling edge of the laser current pulse of LS-LDD becomes slower than HS-LDD for any  $N_c > 3$  and becomes unacceptable with  $N_c = 8$ . As Fig. 6(b) shows, the power consumption of LS-LDD exceeds that of HS-LDD for  $N_c > 6$ . The LDD in this work is designed to support up to 12 VCSEL channels. In this work, it is paired with an eight-channel VCSEL to compose an eight-channel TX. Therefore, the HS-LDD topology is chosen.



Fig. 7. Diagrams of the DLL, the pulse generator, and the MP-Gen [35].

### D. LDD Signal Path

The pulses are generated locally on the LDD ASIC because any transmission of ns-scale pulses from outside the chip to the IC may introduce distortion. Fig. 7 [35] shows the schematic diagrams of the delay-line-based pulse generator, the delay-locked loop (DLL) which performs real-time calibration against process-voltage-temperature (PVT) for the delay line, and the multi-phase generator (MP-Gen) which is part of the OCOO. Circuits in Fig. 7 are the same as those implemented in [35]. The identical delay cells in the DLL, the pulse-Gen, and the MP-Gen are locked to 110-ps delay per cell when the DLL is working with an external reference clock. The 110-ps/step coarse tuning of pulsewidth is performed by choosing the outputs of different delay cells as the two edges of the pulse. An eight-step phase interpolator (PI) is used to perform 13.75-ps/step fine-tuning.
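The coarse and fine tuning steps combine as simple arithmetic. The sketch below assumes that the PI code adds linearly on top of the selected coarse tap spacing; the function names and the nearest-setting search are illustrative and are not taken from the LDD design.

```python
COARSE_STEP_PS = 110.0                     # locked delay per cell (from the text)
PI_STEPS = 8                               # eight-step phase interpolator
FINE_STEP_PS = COARSE_STEP_PS / PI_STEPS   # 13.75 ps


def pulsewidth_ps(coarse_taps: int, fine_code: int) -> float:
    """Pulsewidth for a given tap spacing and PI code (assumed additive)."""
    if coarse_taps < 0:
        raise ValueError("coarse_taps must be non-negative")
    if not 0 <= fine_code < PI_STEPS:
        raise ValueError("fine_code must be in 0..7")
    return coarse_taps * COARSE_STEP_PS + fine_code * FINE_STEP_PS


def nearest_setting(target_ps: float) -> tuple[int, int]:
    """Return (coarse_taps, fine_code) closest to the requested pulsewidth."""
    total_fine_steps = max(round(target_ps / FINE_STEP_PS), 0)
    return divmod(total_fine_steps, PI_STEPS)


coarse, fine = nearest_setting(1500.0)     # the 1.5-ns pulse quoted earlier
print(coarse, fine, pulsewidth_ps(coarse, fine))
```

Under these assumptions any target pulsewidth can be approximated to within half a fine step, that is, about 6.9 ps.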

Fig. 8 shows the pulse distribution path of the proposed LDD. In a solid-state scanning dToF system, the consistency of every TX channel is important for the sensor to get a homogeneous depth image. By design, each channel of the VCSEL and the LDD is physically identical to make their output power as consistent as possible. The consistency of pulse timing control is guaranteed by a binary driving tree. All the pre-drivers and the distribution tree work between LDVCC and VSSH where VSSH is the high “ground” for the gate driver of high-side LD-PMOS. A rapid swing level shifter is placed prior to the tree to convert the output of the pulse generator into a full-swing signal between LDVCC and VSSH. The level shifter is optimized for low duty-cycle narrow pulses to reduce propagation delay and pulsewidth distortion.

### E. OCOO and Auto Peak Power Control

Fig. 8. Signal path of the eight-channel driver.

Fig. 9. Schematic and sub-sampling sequence of the OCOO [31].

The ON-chip oscilloscope (OCO) is a useful technique for measuring a chip's internal high-speed signals. The responsive outputs of the OCO make real-time self-calibration or adaptation possible. In high-speed serial links, electrical OCOs are used as eye-opening monitors (EOMs) whose outputs can guide adaptive equalization of the received signal [36], [37], [38]. When measuring periodic signals, the sub-sampling architecture can be used to acquire high-speed signals with low-speed circuits without increasing the circuit complexity [39], [40], [41].
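The sub-sampling idea can be illustrated with equivalent-time sampling of a repetitive pulse. The sketch below takes one sample per pulse cycle and delays each sample slightly more than the previous one. The Gaussian pulse shape, the 110-ps sampling step, and all names are assumptions made for illustration, not details of the cited designs.

```python
import math


def gaussian_pulse(t_ps: float, center_ps: float = 5_000.0,
                   fwhm_ps: float = 1_500.0) -> float:
    """Toy optical pulse; shape and position are assumed for illustration."""
    sigma = fwhm_ps / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((t_ps - center_ps) / sigma) ** 2)


def subsample_periodic(waveform, period_ps: int, step_ps: int, n_points: int):
    """One sample per cycle, each taken step_ps later within the cycle."""
    points = []
    for k in range(n_points):
        t_abs_ps = k * (period_ps + step_ps)   # actual sampling instant
        phase_ps = t_abs_ps % period_ps        # position inside the waveform
        points.append((phase_ps, waveform(phase_ps)))
    return points


# 5-MHz repetition rate -> 200-ns period; 100 cycles reconstruct 11 ns.
points = subsample_periodic(gaussian_pulse, period_ps=200_000,
                            step_ps=110, n_points=100)
peak_phase_ps, peak_value = max(points, key=lambda p: p[1])
print(peak_phase_ps, round(peak_value, 4))
```

The sampler here only needs to run at the pulse repetition rate, yet the reconstructed waveform has a time step of 110 ps.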

In [35], an electrical OCO is implemented in a laser driver to perform automatic peak current control. In this work, by adding a high-speed PD underneath the TX lens, a mixed-signal sub-sampling OCOO [31] is proposed. By recording the optical waveform from the TX, automatic peak power control is enabled, making the TX resistant to the PVT variations in the LDD and the VCSEL. The block diagram of the OCOO is shown in Fig. 9. The high-speed PD senses the backscattered transmitted light. The OCOO consists of a high-bandwidth transimpedance amplifier (TIA), a 10-bit successive approximation register (SAR) analog-to-digital converter (ADC), a multi-phase ADC trigger generator (Fig. 9), digital circuits, and on-chip static random access memory (SRAM). The two challenges in TIA design are bandwidth and linearity. The cascodes M0 and M1 alleviate the $C_{gd}$ Miller effect, whereas the M2 source follower lowers the output impedance. The two methods improve the input and output bandwidths, respectively. With sufficient equivalent $G_m$ of M0 and M1, the amplifier gain is close to $-(R_1/R_2)$, guaranteeing TIA linearity.

Fig. 10. Block diagram of the SPAD sensor.

As depicted by the sub-sampling sequence of the OCOO in Fig. 9, waveform recording will be done in a certain number of pulse cycles. The peak value  $P_{pk}$  of the waveform denotes the peak optical power. During laser emission, the APPC is triggered periodically. The APPC procedure will try to adjust the dc–dc control signal to make the error between  $P_{pk}$  and its target below a certain threshold. If this is not achieved in finite steps, APPC failure will be reported, meaning emission power deviates from the target. The APPC failure flag will shut down the LDD immediately. This is one laser safety function enabled by OCOO. During normal emission, the OCOO is also enabled to record peak optical power. Abnormally high or low  $P_{pk}$  may indicate a possible TX lens fracture, unwanted skin coverage, or malfunctioning of VCSEL/LDD. In this case, the LDD will also shut down immediately to guarantee laser safety.
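The APPC procedure described above can be sketched as a bounded search loop. Everything in this example is hypothetical: the callback interface, the unit-step search, the 8-bit control code range, and the toy transmitter model are stand-ins chosen for illustration, since the text only specifies the goal (error below a threshold within a finite number of steps, otherwise shut down).

```python
def run_appc(measure_peak_w, set_dcdc_code, start_code, target_w,
             tolerance_w, max_steps, code_min=0, code_max=255):
    """Adjust the dc-dc control code until |P_pk - target| <= tolerance.

    Returns (locked, final_code). locked=False models the APPC failure flag.
    """
    code = start_code
    for _ in range(max_steps):
        set_dcdc_code(code)
        error_w = measure_peak_w() - target_w
        if abs(error_w) <= tolerance_w:
            return True, code
        code += -1 if error_w > 0 else 1          # step toward the target
        code = max(code_min, min(code_max, code))
    return False, code


class ToyTransmitter:
    """Stand-in plant: peak optical power proportional to the dc-dc code."""

    def __init__(self, watts_per_code=0.1):
        self.watts_per_code = watts_per_code
        self.code = 0

    def set_dcdc_code(self, code):
        self.code = code

    def measure_peak_w(self):
        return self.watts_per_code * self.code


tx = ToyTransmitter()
locked, final_code = run_appc(tx.measure_peak_w, tx.set_dcdc_code,
                              start_code=180, target_w=20.0,
                              tolerance_w=0.05, max_steps=32)
if not locked:
    print("APPC failure: shut down the LDD")
```

A real loop would likely use a larger or adaptive step; the unit step keeps the failure condition easy to follow.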

## III. SPAD SENSOR RX IMPLEMENTATION

### A. Chip Architecture

Fig. 11. Die micrograph of the sensor chip.

Fig. 10 shows the block diagram of the $128 \times 128$ SPAD sensor. The pixel pitch of the SPAD array is $25\ \mu\text{m}$. The measured photon detection probability (PDP) of the SPAD is $\sim 1\%$ at $940\text{ nm}$, with a dark count rate (DCR) of $100\text{ kcps}$ at $70^\circ\text{C}$. The pixel array is divided into eight channels, matching the illumination scheme of the laser TX. During the solid-state scanning process, only one channel is turned on at a time, enabling time-divided inter-channel resource sharing of readout circuits such as the TDCs and first-in first-out (FIFO) buffers. Each channel is composed of a $128 \times 16$ SPAD array, in which every $8 \times 8$ block of SPAD pixels forms a sub-group, resulting in $16 \times 2$ sub-groups per channel. Each sub-group has one shared timing signal and nine address lines that indicate which SPAD fired. The sensor chip integrates 32 shared on-chip TDCs with a timing resolution of $100\text{ ps}$. The outputs of the SPAD array are multiplexed to the 32 TDCs, and the output data of the TDCs are then synchronized by the FIFO. The delay cells inside the TDCs are matched with the delay cells in the phase-locked loop (PLL). When the PLL is locked, the delay of these cells and the timing resolution of the TDCs remain constant regardless of PVT variations. The data are read out by four channels of parallel input/output (IO) at a frequency of $50\text{ MHz}$. The data processing of the TDC timestamps is performed on an FPGA, including the data matrix, the time-correlated single photon counting (TCSPC)-based histogram, and peak detection.
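The histogram and peak-detection step can be sketched in a few lines. This is a simplified illustration that assumes 10-bit TDC codes with a 100-ps bin width and a plain arg-max peak search; the actual FPGA processing is not detailed in the text, and the function names are hypothetical.

```python
C_M_PER_S = 299_792_458.0
TDC_LSB_S = 100e-12          # 100-ps timing resolution (from the text)


def tcspc_peak(timestamps, n_bins=1024):
    """Accumulate TDC codes into a histogram and return (peak_bin, count)."""
    hist = [0] * n_bins
    for code in timestamps:
        if 0 <= code < n_bins:   # ignore out-of-range codes
            hist[code] += 1
    peak_bin = max(range(n_bins), key=hist.__getitem__)
    return peak_bin, hist[peak_bin]


def bin_to_distance_m(bin_index: int) -> float:
    """Convert a round-trip time bin to a one-way distance."""
    return bin_index * TDC_LSB_S * C_M_PER_S / 2.0


# Synthetic data: a return around bin 200 on top of sparse background counts.
timestamps = [200] * 30 + [201] * 12 + list(range(0, 960, 7))
peak_bin, count = tcspc_peak(timestamps)
print(peak_bin, count, round(bin_to_distance_m(peak_bin), 3))
```

Accumulating many laser cycles lets the correlated return rise above the uncorrelated BG counts, which is why the peak search operates on the histogram rather than on single timestamps.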

The frame control signals are generated on-chip with channel and pixel area selection. A high-speed LVDS interface is implemented to transmit the laser pulse trigger signal to the LDD chip. Fig. 11 shows the die micrograph of the sensor chip.

### B. Configurable Pixel Binning and Collision Detection Bus

At the limit of sensitivity, i.e., near the maximum range of the measurement system,  $\text{SN}_{\text{BGR}} (S/\sqrt{N})$  can be used to get the detection and false alarm probabilities [42], [43]. Therefore, if  $M$  sub-pixels within a macro (binned) pixel correspond to the same target, which is equivalent to a single sub-pixel receiving  $M$  times the number of laser pulses,  $\text{SN}_{\text{BGR}}$  is increased by a factor of  $\sqrt{M}$ .
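The $\sqrt{M}$ relation suggests a simple way to pick a binning level from a measured SN<sub>BGR</sub>. The sketch below is hypothetical: it assumes the three binning modes described later in this section correspond to $1 \times 1$, $2 \times 2$, and $4 \times 4$ pixels, and the selection policy (smallest binning that meets a required SN<sub>BGR</sub>) is an illustration, not the sensor's documented behavior.

```python
import math

# Assumed mapping from mode_sel to the number of binned sub-pixels M.
BINNING_PIXELS = {0: 1, 1: 4, 2: 16}


def binned_snbgr(single_pixel_snbgr: float, mode_sel: int) -> float:
    """SN_BGR after binning M sub-pixels that see the same target."""
    return single_pixel_snbgr * math.sqrt(BINNING_PIXELS[mode_sel])


def pick_mode(single_pixel_snbgr: float, required_snbgr: float):
    """Smallest binning mode meeting the requirement, or None if none does."""
    for mode_sel in sorted(BINNING_PIXELS):
        if binned_snbgr(single_pixel_snbgr, mode_sel) >= required_snbgr:
            return mode_sel
    return None


print(pick_mode(3.0, 5.0))
print(pick_mode(1.0, 5.0))
```

Choosing the smallest sufficient mode preserves as much native spatial resolution as the SN<sub>BGR</sub> allows.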

Fig. 12. Configurable pixel binning and readout.

Fig. 13. Address decoding process.

Fig. 12 shows the scheme of the configurable pixel binning and readout. Each sub-group is composed of $8 \times 8$ SPAD pixels, and the outputs of these pixels are connected to a shared timing bus. If a pixel fires during the bus dead time after a previous event fired by another pixel, the addresses of the two pixels collide, producing an incorrect address and thus an invalid detection. Taking a 3-bit bus as an example, if two pixels with addresses "101" and "110" fire simultaneously, the merged output code would be "100," pointing to a pixel that did not fire. For this reason, a collision detection coding scheme is implemented as in [11]. The total number ($m$) of collision detection codes for $n$-bit lines is given by the following equation, where $k$ is the number of "1"s in the code:

$$m = \frac{n!}{k!(n-k)!}. \quad (1)$$

To obtain the maximum number of codes, $k$ is chosen to be the integer closest to $n/2$. In this design, 9-bit address buses are implemented, where each address consists of five “1”s and four “0”s, enabling 126 pixels to be coded. The $8 \times 8$ pixels in each sub-group share one TDC. Theoretically, eight address lines can provide 70 non-conflicting codes. However, to facilitate the decoding process of different pixel binning modes, 9-bit address buses are used here, as shown in Fig. 13. The coded pixel addresses of the $8 \times 8$ pixels in a sub-group are shown in Table I.
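The counting in (1) and the collision check can be reproduced directly. The sketch below assumes that simultaneous addresses merge bitwise like a wired-AND, which matches the 3-bit example in the text ("101" and "110" giving "100"); the function names are illustrative. Because every valid address has exactly $k$ ones, any merge of two different addresses has a different number of ones and is therefore detectable.

```python
from math import comb


def n_codes(n_bits: int, k_ones: int) -> int:
    """Number of n-bit codes with exactly k ones, as in (1)."""
    return comb(n_bits, k_ones)


def is_valid(code: int, n_bits: int = 9, k_ones: int = 5) -> bool:
    """A received address is valid only if it has exactly k ones."""
    return 0 <= code < (1 << n_bits) and bin(code).count("1") == k_ones


print(n_codes(9, 5), n_codes(8, 4))

# Two addresses from Table I (row1/col1 and row1/col2) firing together.
a = int("001111100", 2)
b = int("001111001", 2)
merged = a & b               # assumed wired-AND merge on the shared bus
print(format(merged, "09b"), is_valid(a), is_valid(b), is_valid(merged))
```

The same popcount test would also flag a wired-OR merge, since that would produce more than $k$ ones.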

During the decoding process, in the special cases where $\text{addr}(3:0)$ equals “1111” or “0000,” the lower four bits of the address first need to be translated to $\sim\text{addr}(3:0)$. The translated address can then be decoded as normal. Fig. 13 shows the decoding process in pixel mode 0 ($\text{mode\_sel} = 0$), in which $\text{addr}(8:0)$ is interpreted in three steps. First, within the sub-group ($8 \times 8$), $\text{addr}(8:7)$ is used to determine which macro ($4 \times 4$) the pixel belongs to. Second, within the selected macro, $\text{addr}(6:5)$ is used to determine which sub-macro ($2 \times 2$) it belongs to. Finally, within the selected sub-macro, $\text{addr}(2:1)$ is used to determine the exact location of the single pixel ($1 \times 1$). Note that $\text{addr}(4:3)$ and $\text{addr}(0)$ are used for collision detection coding. Pixel binning mode 2 ($\text{mode\_sel} = 2$) needs only the first step, and mode 1 ($\text{mode\_sel} = 1$) needs only the first two steps.

TABLE I  
ADDRESS CODE FOR $8 \times 8$ PIXELS IN A SUB-GROUP

|      | col1      | col2      | col3      | col4      | col5      | col6      | col7      | col8      |
|------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
| row1 | 001111100 | 001111001 | 000101111 | 000111011 | 010001111 | 010011011 | 010111001 | 010101011 |
| row2 | 001110011 | 001100111 | 000111101 | 000111110 | 010011101 | 010011110 | 010101101 | 010101110 |
| row3 | 001001111 | 001011011 | 001110101 | 001101011 | 011011001 | 011011010 | 011101001 | 011101010 |
| row4 | 001011101 | 001011110 | 001101101 | 001101110 | 011010101 | 011010110 | 011100101 | 011100110 |
| row5 | 100001111 | 100011011 | 100111001 | 100101011 | 110011001 | 110011010 | 110101001 | 110101010 |
| row6 | 100011101 | 100011110 | 100101101 | 100101110 | 110010101 | 110010110 | 110100101 | 110100110 |
| row7 | 101011001 | 101011010 | 101101001 | 101101010 | 111010001 | 111010010 | 111100001 | 111100010 |
| row8 | 101010101 | 101010110 | 101100101 | 101100110 | 111010100 | 111000110 | 111100100 | 111110000 |

### C. Time-to-Digital Converter

On an SPAD dToF sensor, a large number of TDCs have to be used for high-throughput parallel readout. Therefore, the area and power consumption of the TDC are critical. Coarse-fine structures are commonly used to reduce power consumption and area, where the coarse TDC (CTDC) is composed of a ripple counter and the fine TDC (FTDC) is based on multi-phase clock signals for HR. With a multi-phase clock distribution network across the TDC array, the implementation of the TDC can be simplified. However, the complexity and power consumption of the clock network increase with the number of TDCs. Another type of TDC, based on a gated ring oscillator (GRO), is only active during the start-stop interval [23], [44]. When the SPAD output is used as the start signal and the laser pulse as the stop signal, the TDC is event-driven, which reduces the TDC dynamic power. However, this scheme only responds to the first event and suffers from photon pile-up under high BG light. Since the power consumption of the TDC array is not dominant in the system, a GRO-based TDC with the laser pulse signal as the start signal is adopted in this work, which is capable of detecting multiple photon events. The number of TDCs can be easily scaled up due to the clock-free structure.

Fig. 14 shows the block diagram of the TDC. The TDC core is composed of a 15-stage ring oscillator (RO). To detect the arrival of the start signal, the first cell in the delay chain is replaced by a NAND gate with one input connected to the start signal. Therefore, the turn-on time of the RO is synchronized to the rising edge of the start signal. The delay of the NAND gate is designed to match that of the other delay cells in the RO. The circuit and layout of the delay cells are replicas of those of the PLL in the sensor chip, ensuring constant time delay across PVT variations. To achieve symmetrical rising and falling edges, both the pull-up and pull-down currents of the delay cells are controlled. When the PLL is locked, the delay of each cell is regulated to 100 ps, corresponding to a distance resolution of 15 mm. The output of the last stage of the delay chain is connected to the coarse counter running at a frequency of 333 MHz. The 5-bit coarse counter has a timing range of up to 96 ns, corresponding to a maximum detection distance of 15 m. To eliminate the meta-stability problem, the stop signal is synchronized by the clock of the coarse counter before sampling the coarse counter output. An encoder is used for thermometer-to-binary conversion of the sampled data of the FTDC. The 10-bit output data are formed by the weighted summation of the coarse and fine data.

Fig. 14. Block diagram of the TDC.
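As a rough illustration of the weighted summation of coarse and fine data, the following Python sketch combines a 5-bit coarse count with a fine phase index and converts the result to distance. The 100-ps cell delay, the 15 stages, and the 5-bit counter come from the text. The assumption of 30 fine steps per coarse count (one full RO period of 3 ns) is inferred from the 333-MHz counter clock, and the function names are hypothetical.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum

CELL_DELAY_PS = 100        # regulated delay per RO cell (from the text)
RO_STAGES = 15             # ring-oscillator stages (from the text)
COARSE_BITS = 5            # coarse counter width (from the text)
# Assumption: one RO period spans a rising and a falling pass through all
# stages, i.e. 30 fine steps of 100 ps = 3 ns per coarse count (~333 MHz).
FINE_STEPS = 2 * RO_STAGES


def tdc_code(coarse: int, fine: int) -> int:
    """Weighted sum of coarse and fine codes, in units of one cell delay."""
    if not 0 <= coarse < 2 ** COARSE_BITS:
        raise ValueError("coarse count out of range")
    if not 0 <= fine < FINE_STEPS:
        raise ValueError("fine phase out of range")
    return coarse * FINE_STEPS + fine


def code_to_distance_m(code: int) -> float:
    """Convert a TDC code to target distance (round-trip time halved)."""
    tof_s = code * CELL_DELAY_PS * 1e-12
    return 0.5 * C_M_PER_S * tof_s
```

Under these assumptions the largest code is 31 × 30 + 29 = 959, which fits in the 10-bit output word, and one LSB corresponds to about 15 mm.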

#### D. ML-Based Spatial Resolution Upscaling

To recover the spatial resolution of depth images from an SPAD sensor operated with pixel binning, we use a U-Net-based upscaling algorithm built on CNNs that exploits multiple features extracted from the histogram data. The network uses the intensity image together with features extracted from the down-sampled histograms to guide the up-sampling of the depth. Pixel binning improves the SNR and depth accuracy at long distances, but it reduces the spatial resolution of the sensor. Therefore, in our work, a customized neural network processes the LR depth data and the intensity image to produce an HR depth map. As shown in Fig. 15, the reconstructed depth map is the sum of the interpolated depth map and the residual depth map, which is generated by the intensity-guided CNN.

Four depth features, D1, D2, D3, and D4, at different resolution scales are used as inputs to the network. For our real data (a $32 \times 16 \times 32$ histogram), their dimensions are $64 \times 32$, $32 \times 16$, $16 \times 8$, and $8 \times 4$, respectively. D1 is acquired by down-sampling, by a factor of two, the $128 \times 64$ depth map previously obtained by four-times up-sampling. D2 is obtained by computing the center of mass of the $32 \times 16 \times 32$ LR histogram. D3 and D4 are obtained by down-sampling this histogram by factors of 2 and 4, respectively, by summing neighboring pixels and then computing the center of mass of the resulting histograms.
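The construction of D2, D3, and D4 can be sketched in a few lines of plain Python. This is a minimal illustration, assuming the histogram is stored as nested lists indexed `[row][column][time bin]`; the function names are hypothetical, empty histograms are mapped to bin 0 as an arbitrary convention, and D1 is omitted because it depends on the up-sampled depth map.

```python
def center_of_mass(bins):
    """Count-weighted mean bin index of one pixel histogram."""
    total = sum(bins)
    if total == 0:
        return 0.0  # assumption: an empty histogram maps to bin 0
    return sum(i * c for i, c in enumerate(bins)) / total


def sum_neighbors(hist, factor):
    """Down-sample spatially by summing factor x factor pixel histograms."""
    rows, cols, nbins = len(hist), len(hist[0]), len(hist[0][0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            acc = [0] * nbins
            for dr in range(factor):
                for dc in range(factor):
                    pixel = hist[r + dr][c + dc]
                    for b in range(nbins):
                        acc[b] += pixel[b]
            out_row.append(acc)
        out.append(out_row)
    return out


def depth_map(hist):
    """Per-pixel center of mass, in units of time bins."""
    return [[center_of_mass(pixel) for pixel in row] for row in hist]


def multiscale_features(hist):
    """Return (D2, D3, D4) computed from the LR histogram."""
    d2 = depth_map(hist)
    d3 = depth_map(sum_neighbors(hist, 2))
    d4 = depth_map(sum_neighbors(hist, 4))
    return d2, d3, d4
```

Summing histograms before taking the center of mass, rather than averaging the per-pixel depths, lets pixels with more counts contribute more to the coarse-scale estimate.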

The intensity image acquired in pixel binning mode 0 has a spatial resolution of $128 \times 64$, which is four times higher in each dimension than the $32 \times 16$ spatial resolution obtained in pixel binning mode 2. Multi-resolution depth features are integrated along the contracting path of the U-Net [28]. The intensity image is processed at multiple resolutions and integrated along the expansive path of the U-Net. Skip connections between the contracting and expansive paths are displayed as black arrows in Fig. 15.

Fig. 15. Spatial resolution upscaling.

The goal of the network is to take the data from the LR histogram and intensity image and produce a residual map $R$ that can be added to an up-scaled version of the LR depth map. The sum of the residual map and the up-scaled LR depth map is the final HR depth map.
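The final summation step can be illustrated as follows. This is only a sketch: nearest-neighbor replication stands in for the interpolation, which the text does not specify, and the residual map is passed in as plain data rather than produced by a CNN. The function names are hypothetical.

```python
def upscale_nearest(depth_lr, factor):
    """Replicate each LR depth value into a factor x factor block."""
    out = []
    for row in depth_lr:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def reconstruct_hr(depth_lr, residual, factor=4):
    """HR depth = up-scaled LR depth + residual map R (element-wise)."""
    up = upscale_nearest(depth_lr, factor)
    if len(up) != len(residual) or len(up[0]) != len(residual[0]):
        raise ValueError("residual map must match the up-scaled size")
    return [[u + r for u, r in zip(up_row, res_row)]
            for up_row, res_row in zip(up, residual)]
```

Learning only the residual means the network has to model the fine detail that interpolation misses, not the full depth range.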

## IV. EXPERIMENTAL RESULTS

Fig. 16 shows the measured single-channel performance of the proposed TX. Fig. 16(a) shows a 930-ps 12-W optical pulse measured by a 5-GHz bandwidth InGaAs-based photodetector connected to an 8-GHz oscilloscope. The optical waveforms were also used to calculate the peak-to-average power ratio (PAPR). Combining the PAPR with the average optical power measured by an integrating sphere yields the actual peak power. The peak optical power and overall power efficiency of 930-ps pulses were measured at different LDVCC values and are plotted in Fig. 16(b). The 20.3-W peak power was reached at a 12.5-V supply with 17.5% efficiency. The efficiency was over 20% for 5–17-W pulses. A programmable pulsewidth of 360–2060 ps was measured for 12-W pulses and is plotted in Fig. 16(c). Fig. 16(d) shows two different waveforms recorded by the OCOO. Fig. 16(e) shows that the peak optical power measured by the OCOO matched well with that measured externally, except for $LDVCC < 6$ V, when the signal received by the OCOO is buried under its own noise.

Fig. 16. TX single-channel performance. (a) 20.3-W 930-ps optical pulse measured by InGaAs PD. (b) Optical power and efficiency. (c) Pulsewidth tuning. (d) Optical waveform measured by OCOO. (e) Peak optical power measured internally and externally.

Fig. 17. Measured channel consistency of dToF TX. (a) Consistency of optical power. (b) Consistency of pulsewidth. (c) Consistency of propagation delay.

Fig. 18. APPC: optical pulses and LDVCC waveform captured by high-speed oscilloscope.
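The peak-power calculation from the PAPR and the average power can be sketched as below. The sketch assumes that the photodetector output is proportional to optical power with a zero baseline, that the samples are uniformly spaced, and that the record spans a whole number of repetition periods. The function names are hypothetical.

```python
def papr(samples):
    """Peak-to-average power ratio of a sampled optical waveform."""
    mean = sum(samples) / len(samples)
    if mean <= 0:
        raise ValueError("waveform must have a positive mean")
    return max(samples) / mean


def peak_power_w(samples, avg_power_w):
    """Scale the measured average power (in watts) by the waveform PAPR."""
    return papr(samples) * avg_power_w
```

For example, a waveform whose peak is eight times its mean, combined with a 1.5-W average reading, gives a 12-W peak. The numbers in this example are illustrative and are not measured values.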

The optical power, pulsewidth, and propagation delay of the eight channels of one TX module were measured. Fig. 17 shows the consistency of eight TX channels.

Fig. 18 shows the real-time optical pulses and LDVCC waveform during the APPC process. Pulses with 1-ns width and a 10-MHz repetition rate are used in this measurement. Within the first 1.2 ms of each frame, the TX starts to emit with the initial guess of LDVCC, detects the peak optical power with the OCOO, and then adjusts LDVCC so that the optical power converges to the target value within several steps. After that, normal emission starts at 2 ms into the frame. As Fig. 18 shows, for two different initial guesses of LDVCC (8.24 and 10.2 V), LDVCC converges to 9.55 and 9.65 V, respectively, and the output power error between the two cases is less than 1%.

Fig. 19. DNL and INL of one TDC.

Fig. 20. Measured spatial resolution upscaling.
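The convergence behavior of the APPC loop can be mimicked with a simple proportional update. This is a sketch under stated assumptions, not the on-chip algorithm, which is not detailed here: the step gain, the step count, and the linear `toy_laser_model` are all hypothetical, and only the 5.0 to 12.5-V LDVCC range and the two initial guesses are taken from the paper.

```python
def appc(measure_peak_w, target_w, ldvcc_init,
         gain_v_per_w=0.3, steps=8, v_min=5.0, v_max=12.5):
    """Step LDVCC toward the voltage that yields the target peak power.

    measure_peak_w: callable returning the peak optical power (W) at a
    given LDVCC; on the chip this role is played by the OCOO.
    """
    v = ldvcc_init
    for _ in range(steps):
        error_w = target_w - measure_peak_w(v)
        # Proportional step, clamped to the LDD supply range.
        v = min(v_max, max(v_min, v + gain_v_per_w * error_w))
    return v


def toy_laser_model(ldvcc):
    """Hypothetical linear power-versus-supply curve, for illustration only."""
    return 2.0 * (ldvcc - 3.5)


# Two initial guesses, as in the measurement, converge to the same LDVCC.
v_low = appc(toy_laser_model, target_w=12.2, ldvcc_init=8.24)
v_high = appc(toy_laser_model, target_w=12.2, ldvcc_init=10.2)
```

With this toy model each step shrinks the voltage error by a constant factor, so both starting points settle near the same supply voltage within a few iterations.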

The measured TDC non-linearity is shown in Fig. 19. From the measurement, a worst-case DNL (INL) of $-0.7/+0.6$ LSB ($-1.3/+1.4$ LSB) is observed. The TDC consumes $500 \mu\text{W}$ under a 1.8-V supply voltage with a maximum throughput of 50 MS/s.

Fig. 20 shows the RX spatial upscaling process of an HR low-SNR depth image (taken when the average signal photons and noise across all the pixels are 14 and 450, respectively). The HR image is first converted into an LR image with $4 \times 4$ pixel binning and then guided by an HR intensity map to reconstruct the HR image with sufficient SNR improvement. We simulate realistic SPAD array measurements (LR histogram and HR intensity) from the MPI Sintel Depth dataset [45] for the training and validation datasets, and from six scenes of the Middlebury dataset [46] for the test dataset. The intensity and depth data are generated at the same time: first, intensity data are generated by photon counting at a resolution of $128 \times 128$; second, depth data are generated based on multi-event TCSPC histograms with a configurable resolution. The data processing algorithm adds large latency overhead, mainly due to large model sizes and redundant parameters. Other upscaling algorithms [27] have shown the potential of keeping the overall processing time under 50 ms, making upscaling more suitable for real-time scenarios. The ML-based upscaling algorithm currently runs on a PC, which also limits the speed of post-processing. An ASIC implementation is under development for real-time data post-processing in future work.

Fig. 21. (a) Channel-scanning illumination at 5-m distance. (b) Ranging distance extension with different channel binning. (c) TX-RX test demonstration.
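To see why $4 \times 4$ binning helps in the low-SNR case quoted above, a shot-noise-limited estimate is enough. The sketch below assumes Poisson statistics, defines SNR as signal counts divided by the standard deviation of the total counts, and assumes every binned pixel sees the same signal and background. These are simplifications of ours and not the SNR metric used by the sensor.

```python
import math


def shot_noise_snr(signal, background, n_pixels=1):
    """Poisson-limited SNR after summing the counts of n_pixels pixels."""
    s = signal * n_pixels
    b = background * n_pixels
    return s / math.sqrt(s + b)


snr_single = shot_noise_snr(14, 450)               # one pixel
snr_binned = shot_noise_snr(14, 450, n_pixels=16)  # 4 x 4 binning
gain = snr_binned / snr_single
```

Under these assumptions the gain equals the square root of the number of binned pixels, that is, a factor of 4 for $4 \times 4$ binning, at the cost of a 16-fold reduction in pixel count that the upscaling network then has to recover.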

Fig. 21(a) demonstrates the eight-channel scanning illumination at 5-m distance. Fig. 21(b) shows the ranging distance extension with different channel binning, including flash mode (all the channels binned together), $2 \times 4$ mode with bins of four channels, $4 \times 2$ mode with bins of two channels, and $8 \times 1$ mode without any binning. Based on the calculation and measurement results, we can conclude that an MC system can save a large amount of power or reach a longer distance with the same power consumption. Fig. 21(c) shows the TX-RX test demonstration. The LiDAR system was tested indoors on a 15-m automatic rail under 500-lux ambient light.
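The trade-off between power and distance can be approximated with a first-order radiometric model. The sketch below is not the calculation behind Fig. 21(b). It assumes a fixed optical power per shot, an extended diffuse target, and a per-pixel return proportional to the irradiance on the target divided by the squared distance, so that concentrating the light into $1/N$ of the FoV allows a $\sqrt{N}$ longer range for the same return. The mapping of mode names to group counts follows the description above, and the function names are hypothetical.

```python
import math

# Number of sequentially illuminated sub-regions in each mode.
MODES = {"flash": 1, "2x4": 2, "4x2": 4, "8x1": 8}


def relative_range(groups):
    """Range relative to flash mode at equal optical power per shot."""
    return math.sqrt(groups)


def relative_power_for_same_range(groups):
    """Optical power per shot relative to flash mode at equal range."""
    return 1.0 / groups
```

In this simplified model the $8 \times 1$ mode reaches roughly 2.8 times the flash-mode distance at equal power per shot, or the same distance with one eighth of the power. Measured gains will differ because of ambient light, optics, and detector non-idealities.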

Fig. 22 shows the captured depth map and the corresponding RGB picture in the indoor environment with eight TX channels enabled sequentially. The measured distance versus ground

TABLE II  
PERFORMANCE SUMMARY AND COMPARISON OF THIS WORK WITH THE STATE-OF-THE-ART dToF LiDAR SYSTEMS

| Parameter                              | This Work                                     | [23]           | [21]              | [22]           | [15]              |
|----------------------------------------|-----------------------------------------------|----------------|-------------------|----------------|-------------------|
| Customized ASICs in LiDAR System       | TX + RX                                       | RX             | RX                | RX             | RX                |
| <b>Laser Projection</b>                | <b>Solid-State Scanning + Flash (ASIC-TX)</b> | Flash (PCB-TX) | Scanning (PCB-TX) | Flash (PCB-TX) | Scanning (PCB-TX) |
| <b>TX Automatic Peak Power Control</b> | <b>YES</b>                                    | NO             | NO                | NO             | NO                |
| <b>Wavelength</b>                      | <b>940nm</b>                                  | 671nm          | N/A               | 780nm          | 905nm             |
| <b>Pixel Array</b>                     | <b>128x128</b>                                | 256x256        | 1200x900          | 256x128        | 189x600           |
| <b>TDC Resolution</b>                  | <b>100ps</b>                                  | 35/560ps       | N/A               | 60ps           | 1000ps            |
| <b>Maximum Distance</b>                | <b>15m</b>                                    | 50m            | 250m              | 100m           | 150-200m          |
| <b>Depth Accuracy</b>                  | <b>20cm (15m)</b>                             | 17cm (50m)     | 1.5m (250m)       | 7cm (100m)     | 30cm (200m)       |
| <b>Process Technology</b>              | <b>180nm BCD</b>                              | 40nm/90nm      | 65nm              | 45nm           | 90nm/40nm         |
| <b>Frame Rate</b>                      | <b>1fps</b>                                   | 30fps          | 30fps             | N/A            | 20fps             |
| <b>FoV</b>                             | <b>36°x36°</b>                                | 1.2°x1.2°      | N/A               | 2.0°x2.0°      | 25.2°x9.45°       |
| <b>Peak Optical Emission Power</b>     | <b>20W</b>                                    | 9.5W           | N/A               | N/A            | 45W               |
| <b>Average Optical Emission Power</b>  | <b>150mW</b>                                  | 1.8mW          | N/A               | 1.5mW          | N/A               |
| <b>Pulse Repetition Rate</b>           | <b>5MHz</b>                                   | 1.9MHz         | N/A               | 0.5MHz         | N/A               |
| <b>Pixel Fill-Factor</b>               | <b>18%</b>                                    | 51%            | N/A               | N/A            | N/A               |
| <b>PDP @ VE</b>                        | <b>1% @ 1.8V</b>                              | 23% @ 3V       | N/A               | N/A            | 22%               |
| <b>TDC DNL</b>                         | <b>+0.6LSB/-0.7LSB</b>                        | +0.05/-0.05LSB | N/A               | +0.05/-0.05LSB | N/A               |
| <b>TDC INL</b>                         | <b>+1.4LSB/-1.3LSB</b>                        | +0.1/-0.08LSB  | N/A               | +0.1/-0.1LSB   | N/A               |
| <b>Power Consumption</b>               | <b>37mW</b>                                   | 77.6mW         | 2500mW            | 51.9mW         | 1192mW            |



Fig. 22. Measured system depth image, distance versus ground truth, and depth accuracy.

TABLE III  
PERFORMANCE SUMMARY OF THE TX

| ASIC-TX Performance Summary                             |                                         |
|---------------------------------------------------------|-----------------------------------------|
| Item                                                    | Chip Performance Parameter              |
| Process Technology                                      | 180nm BCD                               |
| Supply Voltage                                          | 5.0~12.5V(LDD), 3.3V(analog), 1.8V(I/O) |
| Channel Number                                          | 1 to 12                                 |
| Peak Optical Power                                      | 20.35 W @ 12.5V                         |
| Pulse Width Tuning Range                                | 360 to 2060 ps                          |
| Pulse Jitter                                            | 11 ps[RMS]                              |
| Trise                                                   | 90ps @ 16.20 W                          |
| Tfall                                                   | 145ps @ 16.20 W                         |
| On-chip Oscilloscope Bandwidth                          | 800MHz                                  |
| Chip Size                                               | 4.2mm x 3.2mm                           |
| Total Electrical to Optical Power Conversion Efficiency | 21.6% @ 16.20W, 950ps pulse             |

truth along with the accuracy is also shown in Fig. 22. As can be seen from the graph, the measured depth error is less than 20 cm within 15-m distance for both 10% and 50% reflectivity. The SPAD device is not well-optimized, limiting the ranging distance of the LiDAR system. A new version is under development for ranging distance extension in future work. The power breakdown of the laser TX and the SPAD RX is shown in Fig. 23.

Fig. 23. Power breakdown of the laser TX and the SPAD RX.

Table III shows a performance summary of the laser diode ASIC-TX. Table II shows a performance summary of the SPAD sensor chip and the comparison with the state-of-the-art dToF LiDAR systems. Compared with previous works, this work proposes a new approach of solid-state MC dToF illumination, an OCOO for dToF pulse observation, and SNR-based SPAD pixel binning with an ML-based upscaling algorithm for resolution recovery.

## V. CONCLUSION

A solid-state MC dToF LiDAR system is presented. A custom MC LDD ASIC and an SPAD depth sensor are implemented. The driver chip was fabricated in the 180-nm BCD process, measures 4.2 × 3.2 mm, and is capable of generating 20-W/ch peak optical power with a multi-junction VCSEL array. The sensor chip with an integrated 128 × 128 SPAD array was also implemented in the 180-nm BCD process and measures 6.5 × 5.3 mm. In the driver, an OCOO is proposed for optical pulse monitoring and laser safety protection. In the sensor, configurable pixel binning at the hardware level is proposed for low SNR conditions. An ML-based spatial upscaling algorithm is adopted during data post-processing in software. With this software–hardware co-optimization scheme, HR and high-SNR depth images can be constructed with lower power.

### ACKNOWLEDGMENT

The authors acknowledge the great efforts of the Layout Team, DVT Team, FAE Team, and QRE Team, of PhotonIC Technologies Inc. in building this system.

### REFERENCES

- [1] S. Foix, G. Alenya, and C. Torras, “Lock-in time-of-flight (ToF) cameras: A survey,” *IEEE Sensors J.*, vol. 11, no. 9, pp. 1917–1926, Sep. 2011, doi: [10.1109/JSEN.2010.2101060](https://doi.org/10.1109/JSEN.2010.2101060).
- [2] J. Smisek, M. Jancosek, and T. Pajdla, “3D with Kinect,” in *Proc. IEEE Int. Conf. Comput. Vis. Workshops (ICCV Workshops)*, Feb. 2011, pp. 1154–1160.
- [3] M. Van den Berg and L. Van Gool, “Combining RGB and ToF cameras for real-time 3D hand gesture interaction,” in *Proc. IEEE Workshop Appl. Comput. Vis. (WACV)*, Jan. 2011, pp. 66–72.
- [4] Y. Kato et al., “320 × 240 back-illuminated 10-μm CAPD pixels for high-speed modulation time-of-flight CMOS image sensor,” *IEEE J. Solid-State Circuits*, vol. 53, no. 4, pp. 1071–1078, Apr. 2018, doi: [10.1109/JSSC.2018.2789403](https://doi.org/10.1109/JSSC.2018.2789403).
- [5] D. Kim et al., “5.4 A dynamic pseudo 4-tap CMOS time-of-flight image sensor with motion artifact suppression and background light cancelling over 120klux,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2020, pp. 100–102.
- [6] M.-S. Keel et al., “7.1 A 4-tap 3.5μm 1.2 mpixel indirect time-of-flight CMOS image sensor with peak current mitigation and multi-user interference cancellation,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2021, pp. 106–108.
- [7] Y. Kato et al., “320×240 back-illuminated 10 μm CAPD pixels for high speed modulation time-of-flight CMOS image sensor,” in *Proc. Symp. VLSI Circuits*, Jun. 2017, pp. 2858–2864.
- [8] C. Bamji et al., “A review of indirect time-of-flight technologies,” *IEEE Trans. Electron Devices*, vol. 69, no. 6, pp. 2779–2793, Jun. 2022, doi: [10.1109/TED.2022.3145762](https://doi.org/10.1109/TED.2022.3145762).
- [9] M. Perenzoni, D. Perenzoni, and D. Stoppa, “6.5 A 64×64-pixel digital silicon photomultiplier direct ToF sensor with 100Mphotons/s/pixel background rejection and imaging/altimeter mode with 0.14% precision up to 6km for spacecraft navigation and landing,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Jan. 2016, pp. 118–119.
- [10] C. Niclass, M. Soga, H. Matsubara, M. Ogawa, and M. Kagami, “A 0.18μm CMOS SoC for a 100m-range 10fps 200×96-pixel time-of-flight depth sensor,” in *Proc. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2013, pp. 488–489, doi: [10.1109/ISSCC.2013.6487827](https://doi.org/10.1109/ISSCC.2013.6487827).
- [11] C. Zhang, S. Lindner, I. M. Antolovic, J. M. Pavia, M. Wolf, and E. Charbon, “A 30-frames/s, 252×144 SPAD flash LiDAR with 1728 dual-clock 48.8-ps TDCs, and pixel-wise integrated histogramming,” *IEEE J. Solid-State Circuits*, vol. 54, no. 4, pp. 1137–1151, Apr. 2019, doi: [10.1109/JSSC.2018.2883720](https://doi.org/10.1109/JSSC.2018.2883720).
- [12] A. R. Ximenes, P. Padmanabhan, M.-J. Lee, Y. Yamashita, D. N. Yaung, and E. Charbon, “A 256×256 45/65 nm 3D-stacked SPAD-based direct TOF image sensor for LiDAR applications with optical polar modulation for up to 18.6dB interference suppression,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2018, pp. 96–98.
- [13] S. W. Hutchings et al., “A reconfigurable 3-D-stacked SPAD imager with in-pixel histogramming for flash LiDAR or high-speed time-of-flight imaging,” *IEEE J. Solid-State Circuits*, vol. 54, no. 11, pp. 2947–2956, Nov. 2019, doi: [10.1109/JSSC.2019.2939083](https://doi.org/10.1109/JSSC.2019.2939083).
- [14] R. Halterman and M. Bruch, “Velodyne HDL-64E LiDAR for unmanned surface vehicle obstacle detection,” *Proc. SPIE*, vol. 7692, p. 9, Apr. 2010.
- [15] O. Kumagai et al., “7.3 A 189×600 back-illuminated stacked SPAD direct time-of-flight depth sensor for automotive LiDAR systems,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, vol. 64, Feb. 2021, pp. 110–112.
- [16] C.-H. Lin, H.-S. Zhang, C.-P. Lin, and G.-D.-J. Su, “Design and realization of wide field-of-view 3D MEMS LiDAR,” *IEEE Sensors J.*, vol. 22, no. 1, pp. 115–120, Jan. 2022, doi: [10.1109/jsen.2021.3127045](https://doi.org/10.1109/jsen.2021.3127045).
- [17] A. Srowik, “256×16 SPAD array and 16-channel ultrashort pulsed laser driver for automotive LiDAR,” in *Proc. Int. SPAD Workshop*, 2020, pp. 18–27.
- [18] U. Kabuk, “4D solid-state LiDAR,” in *Proc. Int. SPAD Workshop (ISSW)*, 2020.
- [19] C. Rogers et al., “A universal 3D imaging sensor on a silicon photonics platform,” *Nature*, vol. 590, no. 7845, pp. 256–261, Feb. 2021.
- [20] C. Niclass, M. Soga, H. Matsubara, S. Kato, and M. Kagami, “A 100-m range 10-frame/s 340×96-pixel time-of-flight depth sensor in 0.18-μm CMOS,” *IEEE J. Solid-State Circuits*, vol. 48, no. 2, pp. 559–572, Feb. 2013, doi: [10.1109/JSSC.2012.2227607](https://doi.org/10.1109/JSSC.2012.2227607).
- [21] T. Okino et al., “5.2 A 1200×900 6μm 450fps geiger-mode vertical avalanche photodiodes CMOS image sensor for a 250m time-of-flight ranging system using direct-indirect-mixed frame synthesis with configurable-depth-resolution down to 10 cm,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2020, pp. 96–98.
- [22] P. Padmanabhan et al., “7.4 A 256×128 3D-stacked (45 nm) SPAD flash LiDAR with 7-Level coincidence detection and progressive gating for 100m range and 10klux background light,” in *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2021, pp. 111–113.
- [23] R. K. Henderson et al., “5.7 A 256×256 40 nm/90 nm CMOS 3D-stacked 120dB dynamic-range reconfigurable time-resolved SPAD imager,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2019, pp. 106–108.
- [24] A. T. Erdogan et al., “A high dynamic range 128×120 3-D stacked CMOS SPAD image sensor SoC for fluorescence microendoscopy,” *IEEE J. Solid-State Circuits*, vol. 57, no. 6, pp. 1649–1660, Jun. 2022, doi: [10.1109/JSSC.2022.3150721](https://doi.org/10.1109/JSSC.2022.3150721).
- [25] T. Al Abbas et al., “A 128×120 5-wire 1.96 mm<sup>2</sup> 40 nm/90 nm 3D stacked SPAD time resolved image sensor SoC for microendoscopy,” in *Proc. Symp. VLSI Circuits*, Jun. 2019, pp. C260–C261.
- [26] J. Park, H. Kim, Y.-W. Tai, M. S. Brown, and I. Kweon, “High quality depth map upsampling for 3D-TOF cameras,” in *Proc. Int. Conf. Comput. Vis.*, Nov. 2011, pp. 1623–1630.
- [27] I. Gyongy et al., “High-speed 3D sensing via hybrid-mode imaging and guided upsampling,” *Optica*, vol. 7, no. 10, pp. 1253–1260, Oct. 2020. [Online]. Available: <http://opg.optica.org/optica/abstract.cfm?URI=optica-7-10-1253>
- [28] C. Guo, C. Li, J. Guo, R. Cong, H. Fu, and P. Han, “Hierarchical features driven residual learning for depth map super-resolution,” *IEEE Trans. Image Process.*, vol. 28, no. 5, pp. 2545–2557, May 2019, doi: [10.1109/TIP.2018.2887029](https://doi.org/10.1109/TIP.2018.2887029).
- [29] A. Ruet, S. McLaughlin, R. K. Henderson, I. Gyongy, A. Halimi, and J. Leach, “Robust super-resolution depth imaging via a multi-feature fusion deep network,” *Opt. Exp.*, vol. 29, no. 8, pp. 11917–11937, Apr. 2021. [Online]. Available: <http://opg.optica.org/oe/abstract.cfm?URI=oe-29-8-11917>, doi: [10.1364/OE.415563](https://doi.org/10.1364/OE.415563).
- [30] J. Kostamovaara, S. Jahromi, L. Hallman, G. Duan, J.-P. Jansson, and P. Keranen, “Solid-state pulsed time-of-flight 3-D range imaging using CMOS SPAD focal plane array receiver and block-based illumination techniques,” *IEEE Photon. J.*, vol. 14, no. 2, pp. 1–11, Apr. 2022, doi: [10.1109/JPHOT.2022.3153487](https://doi.org/10.1109/JPHOT.2022.3153487).
- [31] X. Chen, Y. Wang, Y. Li, T. Xia, Y. Wu, and P. Jiang, “Laser pulse sampling and detecting circuit, system, and method,” U.S. Patent 11125882B1, 2021.
- [32] G. Blasco, D. Dörich, H. Reh, R. Burkard, E. Isern, and E. Martin, “A Sub-ns integrated CMOS laser driver with configurable laser pulses for time-of-flight applications,” *IEEE Sensors J.*, vol. 18, no. 16, pp. 6547–6556, Aug. 2018, doi: [10.1109/JSEN.2018.2850742](https://doi.org/10.1109/JSEN.2018.2850742).
- [33] G. Blasco, D. Dorich, E. Isern, R. Burkard, and E. Martin, “An 8 A, 2 to 25 ns configurable pulse-width integrated CMOS pulsed laser driver with on-chip mounted laser diode,” in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, Oct. 2020, pp. 1–5.
- [34] T. Xia et al., “An integrated 8A pulsed VCSEL array driver under 12V supply with built-in pulse monitor and automatic peak current control for direct time-of-flight applications,” in *Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC)*, Nov. 2021, pp. 1–3.
- [35] T. Xia et al., “An 8A sub-1ns pulsed VCSEL driver IC with built-in pulse monitor and automatic peak current control for direct time-of-flight applications,” *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 69, no. 11, pp. 4193–4197, Nov. 2022, doi: [10.1109/TCSII.2022.3186542](https://doi.org/10.1109/TCSII.2022.3186542).
- [36] C.-K. Seong, J. Rhim, and W.-Y. Choi, “A 10-Gb/s adaptive look-ahead decision feedback equalizer with an eye-opening monitor,” *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 59, no. 4, pp. 209–213, Apr. 2012, doi: [10.1109/TCSII.2012.2186366](https://doi.org/10.1109/TCSII.2012.2186366).

- [37] B. Dehlaghi, S. Magierowski, and L. Belostotski, “A 12.5-Gb/s on-chip oscilloscope to measure eye diagrams and jitter histograms of high-speed signals,” *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 22, no. 5, pp. 1127–1137, May 2014, doi: [10.1109/TVLSI.2013.2265895](https://doi.org/10.1109/TVLSI.2013.2265895).
- [38] H. Won et al., “A 28-Gb/s receiver with self-contained adaptive equalization and sampling point control using stochastic sigma-tracking eye-opening monitor,” *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 64, no. 3, pp. 664–674, Mar. 2017, doi: [10.1109/TCSI.2016.2614349](https://doi.org/10.1109/TCSI.2016.2614349).
- [39] M. Takamiya, M. Mizuno, and K. Nakamura, “An on-chip 100 GHz-sampling rate 8-channel sampling oscilloscope with embedded sampling clock generator,” in *Proc. IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, vol. 1, Feb. 2002, pp. 182–458.
- [40] M. Safi-Harb and G. W. Roberts, “70-GHz effective sampling time-base on-chip oscilloscope in CMOS,” *IEEE J. Solid-State Circuits*, vol. 42, no. 8, pp. 1743–1757, Aug. 2007, doi: [10.1109/JSSC.2007.900292](https://doi.org/10.1109/JSSC.2007.900292).
- [41] H. Choi, A. V. Gomes, and A. Chatterjee, “Signal acquisition of high-speed periodic signals using incoherent sub-sampling and back-end signal reconstruction algorithms,” *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 7, pp. 1125–1135, Jul. 2011, doi: [10.1109/TVLSI.2010.2048135](https://doi.org/10.1109/TVLSI.2010.2048135).
- [42] J. Kostamovaara, S. S. Jahromi, and P. Keränen, “Temporal and spatial focusing in SPAD-based solid-state pulsed time-of-flight laser range imaging,” *Sensors*, vol. 20, no. 21, p. 5973, Oct. 2020. [Online]. Available: <https://www.mdpi.com/1424-8220/20/21/5973>, doi: [10.3390/s20215973](https://doi.org/10.3390/s20215973).
- [43] S. Jahromi, J.-P. Jansson, P. Keranen, and J. Kostamovaara, “A 32 × 128 SPAD-257 TDC receiver IC for pulsed TOF solid-state 3-D imaging,” *IEEE J. Solid-State Circuits*, vol. 55, no. 7, pp. 1960–1970, Jul. 2020, doi: [10.1109/JSSC.2020.2970704](https://doi.org/10.1109/JSSC.2020.2970704).
- [44] A. Rochas et al., “First fully integrated 2-D array of single-photon detectors in standard CMOS technology,” *IEEE Photon. Technol. Lett.*, vol. 15, no. 7, pp. 963–965, Jul. 2003, doi: [10.1109/LPT.2003.813387](https://doi.org/10.1109/LPT.2003.813387).
- [45] D. J. Butler, J. Wulff, G. B. Stanley, and M. J. Black, *A Naturalistic Open-Source Movie for Optical Flow Evaluation*. Berlin, Germany: Springer, 2012.
- [46] D. Scharstein and C. Pal, “Learning conditional random fields for stereo,” in *Proc. IEEE Conf. Comput. Vis. Pattern Recognit.*, Jun. 2007, pp. 1–8.



**Shenglong Zhuo** (Member, IEEE) received the B.S. degree from the Nanjing University of Post and Telecommunications, Nanjing, China, in 2010, and the M.S. degree from Southeast University, Nanjing, in 2013. He is currently pursuing the Ph.D. degree in microelectronics with the State Key Laboratory of Application-Specific Integrated Circuit (ASIC) and System, Fudan University, Shanghai, China.

From 2013 to 2017, he was a Power-Integrated Circuit (IC) Analog Design Engineer with Silergy Corporation, Nanjing. Since 2018, he has been working as an Analog Mixed-Signal IC Design Engineer with PhotonIC Technologies Inc., Shanghai. His research interests include high-speed and high-power mixed-signal circuit design, single-photon avalanche diode (SPAD)-based optical sensor circuits and systems, and neural network accelerators for various sensor applications.



**Tao Xia** received the B.S. and M.S. degrees in microelectronics from Peking University, Beijing, China, in 2011 and 2014, respectively. He is currently pursuing the Ph.D. degree in microelectronics with the State Key Laboratory of Application-Specific Integrated Circuit (ASIC) and System, Fudan University, Shanghai, China.

From 2014 to 2016, he was an RF-Integrated Circuit (IC) Design Engineer with Spreadtrum Communications (now UNISOC), Shanghai. Since 2016, he has been working as an Analog Mixed-Signal IC Design Engineer with PhotonIC Technologies Inc., Shanghai. His research interests include high-speed wireline transceiver ICs and high-power high-speed mixed-signal laser driver ICs.



**Lei Zhao** received the B.S. degree from the Huazhong University of Science and Technology, Wuhan, China, in 2014, and the M.S. degree from the Institute of Microelectronics, Chinese Academy of Sciences, Beijing, China, in 2017. He is currently pursuing the Ph.D. degree in microelectronics with Fudan University, Shanghai, China.

His research interests include optical transceivers and time-of-flight (ToF) light detection and ranging (LiDAR) sensors.



**Miao Sun** received the B.S. degree in electronics science and technology from Southwest University, Chongqing, China, in 2018. She is currently pursuing the Ph.D. degree with Fudan University, Shanghai, China, with a focus on neural network design and hardware implementation for 3-D AI accelerators.

Her current research interests include light detection and ranging (LiDAR) time-of-flight-related neural network algorithm and accelerator system-on-chip (SoC) system design.




**Yifan Wu** received the B.S. degree in communication engineering from the Wuhan University of Technology, Wuhan, China, in 2011, and the joint M.S. degree from Nanyang Technological University, Singapore and the Technical University of Munich, Munich, Germany, in 2013. He is currently pursuing the Ph.D. degree with Tongji University, Shanghai, China.

From 2012 to 2018, he was a Senior Digital Design Engineer with Infineon Technologies, Singapore, in the automotive area. Since 2018, he has been working with PhotonIC Technologies Inc., Shanghai, on direct time-of-flight (dToF) and sensor systems.

**Hengwei Yu** received the B.S. degree in electronic information engineering from Jianghan University, Wuhan, China, in 2016, and the M.S. degree in integrated circuit (IC) engineering from the University of Science and Technology of China, Hefei, China, in 2019. He is currently pursuing the Ph.D. degree with Fudan University, Shanghai, China.

Since 2019, he has been an Intern with PhotonIC Technologies Inc., Shanghai, focusing on direct time-of-flight (dToF) system design.



**Jiqing Xu** received the B.S. degree from Tongji University, Shanghai, China, in 2017, and the M.S. degree from Fudan University, Shanghai, in 2020, where he is currently pursuing the Ph.D. degree.

Since 2020, he has been an Intern with PhotonIC Technologies Inc., Shanghai, focusing on direct time-of-flight (dToF) and sensor systems.





**Jier Wang** received the B.S. degree from the Xi'an University of Architecture and Technology, Xi'an, China, in 2014, and the M.S. degree from the University of Chinese Academy of Sciences, Beijing, China, in 2018. He is currently pursuing the Ph.D. degree with Fudan University, Shanghai, China.

From 2018 to 2021, he was an Optical Engineer with Hamamatsu Photonics, Beijing, and GigaDevice, Shanghai. Since 2021, he has been working with PhotonIC Technologies Inc., Shanghai, focusing on direct time-of-flight (dToF) and optical systems.



**Zhihong Lin** received the B.S. degree from the South China University of Technology, Guangzhou, China, in 2013, and the M.S. degree from Fudan University, Shanghai, China, in 2016, where he is currently pursuing the Ph.D. degree.

From 2016 to 2019, he was with Texas Instruments, Shanghai. His research interests include time-of-flight sensors and ambient-light sensors.



**Yuan Li** received the B.S. and M.S. degrees from East China Normal University, Shanghai, China, in 2015 and 2018, respectively. He is currently pursuing the Ph.D. degree with Fudan University, Shanghai.

From 2018 to 2020, he was a Senior Analog Design Engineer with PhotonIC Technologies Inc., Shanghai. His research interests include time of flight (ToF).



**Lei Qiu** received the B.Sc. and M.Sc. degrees in electrical engineering from Southeast University, Nanjing, China, in 2009 and 2011, respectively, and the Ph.D. degree in electrical and electronics engineering from Nanyang Technological University, Singapore, in 2016.

From July 2015 to July 2018, he was a Design Engineer with Infineon Technologies Asia Pacific Pte Ltd., Singapore. Since September 2018, he has been working as a Professor with the College of Electrical Information and Engineering, Tongji University, Shanghai, China. He is the author of 30 SCI/EI articles and three inventions. His research interests include high-speed high-resolution low-power A/D converters, signal chain circuitry, and low-power receivers.

Dr. Qiu was a recipient of the Student Travel Grant Award by ASSCC in 2016.



**Rui Bai** received the B.S. degree in microelectronics from the University of Electronic Science and Technology of China (UESTC), Chengdu, China, in 2008, and the Ph.D. degree in electrical and computer engineering from Oregon State University, Corvallis, OR, USA, in 2014.

Since 2015, he has been with a startup that he co-founded in Shanghai, China. His current interests include power-efficient high-speed circuits for electrical and optical I/O and single-photon avalanche diode (SPAD)-based optical sensor circuits and systems.

Dr. Bai was a recipient of the Analog Devices Outstanding Student Designer Award in 2014.



**Xuefeng Chen** received the B.S. and M.S. degrees in microelectronics from Fudan University, Shanghai, China, in 2000 and 2003, respectively, and the Ph.D. degree in electrical engineering from Oregon State University, Corvallis, OR, USA, in 2007.

From 2007 to 2016, he was one of the Core Technical Leads with MaxLinear, Carlsbad, CA, USA, a successful fabless semiconductor company that designs chips for broadband radio and infrastructure applications. He is also an expert in the design of high-performance frequency synthesizers and high-speed links. Since 2016, he has been a Co-Founder with PhotonIC Technologies Inc., Shanghai, a fabless semiconductor company.

Dr. Chen has done outstanding research work on the continuous-time (CT) delta-sigma analog-to-digital converter (ADC) design for a multi-mode wireless handset that has had a substantial impact on the delta-sigma data converter technology.



**Patrick Yin Chiang** (Senior Member, IEEE) received the B.S. degree in electrical engineering and computer science from the University of California at Berkeley, Berkeley, CA, USA, in 1997, and the M.S. and Ph.D. degrees in electrical engineering from Stanford University, Stanford, CA, USA, in 2000 and 2007, respectively.

He was an Associate Professor at Oregon State University, Corvallis, OR, USA. He is currently a Professor with the State Key Laboratory of Application-Specific Integrated Circuit (ASIC) and System, Fudan University, Shanghai, China. He is also the Co-Founder of PhotonIC Technologies Inc., Shanghai, a fabless semiconductor company. His research groups have published more than 150 conference papers/journal articles in the area of energy-efficient circuits and systems: silicon photonics, four-level pulse amplitude modulation (PAM4) wireline, optical transceivers, and 3-D sensing.

Dr. Chiang was a recipient of the Department of Energy Early CAREER and the NSF-CAREER awards for energy-efficient interconnects and near-threshold computing.