

# A 50-Gb/s PAM-4 Silicon-Photonic Transmitter Incorporating Lumped-Segment MZM, Distributed CMOS Driver, and Integrated CDR

Qiwen Liao, Yuguang Zhang, Siyuan Ma, Lei Wang, Leliang Li, Guike Li, Zhao Zhang, *Member, IEEE*, Jian Liu, Nanjian Wu, *Member, IEEE*, Liyuan Liu, *Member, IEEE*, Yong Chen, *Senior Member, IEEE*, Xi Xiao, and Nan Qi, *Member, IEEE*

**Abstract**—This article presents a 50-Gb/s optical transmitter (TX) consisting of a 40-nm distributed CMOS driver and a 180-nm silicon-photonic modulator. A lumped-segment Mach-Zehnder modulator (LS-MZM) is developed for high-bandwidth (BW) four-level pulse amplitude modulation (PAM-4). A multi-segment driver with limiting outputs is co-designed and distributed across the LS-MZM segments. By grouping these LS-MZM segments in a thermometer code, high-linearity modulation is realized without power-hungry high-swing linear drivers. To improve the optical PAM-4 signal integrity, in-segment multiplexing along with clock phase interpolation is adopted to synchronize the electrical and optical signals across all segments. The hybrid coupling between the driver and modulator is devised to boost the BW of the high-speed data path, while a half-rate clock and data recovery (CDR) circuit is integrated to remove the accumulated jitter. Measurements show that the TX exhibits an extinction ratio (ER) of up to 9.8 dB and a ratio of level mismatch (RLM) of 0.99. A figure-of-merit (FoM) of 1.39 pJ/bit/dB corresponds to a power of 682 mW, which can be further reduced by 40% at the cost of a degraded ER of 4 dB.

Manuscript received August 7, 2021; revised October 23, 2021; accepted November 29, 2021. Date of publication December 28, 2021; date of current version February 24, 2022. This article was approved by Associate Editor Farhana Sheikh. This work was supported in part by the National Key Research and Development Program of China under Grant 2019YFB2204702; in part by the National Natural Science Foundation of China under Grant 61874115; in part by the Pioneer Hundred Talents Program, Chinese Academy of Sciences; and in part by the Major Key Project of PCL under Grant PCL2021A14. (*Corresponding authors:* Nan Qi; Xi Xiao.)

Qiwen Liao was with the State Key Laboratory of Superlattices and Microstructures, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China, and also with the Center of Material Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China. He is now with Changsha Hengding Information Technology Corporation, Changsha 410221, China.

Yuguang Zhang, Lei Wang, and Xi Xiao are with the National Optoelectronics Innovation Center (NOEIC) and the State Key Laboratory of Optical Communication Technologies and Networks, China Information and Communication Technologies Group Corporation (CICT), Wuhan 430074, China, and also with the Peng Cheng Laboratory, Shenzhen 518055, China (e-mail: xxiao@wri.com.cn).

Siyuan Ma, Leliang Li, Guike Li, Zhao Zhang, Jian Liu, Nanjian Wu, Liyuan Liu, and Nan Qi are with the State Key Laboratory of Superlattices and Microstructures, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China, and also with the Center of Material Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China (e-mail: qinan@semi.ac.cn).

Yong Chen is with the State Key Laboratory of Analog and Mixed-Signal VLSI and IME/ECE-FST, University of Macau, Taipa, Macau.

Color versions of one or more figures in this article are available at <https://doi.org/10.1109/JSSC.2021.3134874>.

Digital Object Identifier 10.1109/JSSC.2021.3134874

The PAM-4 CDR helps to achieve  $<10^{-12}$  BER and  $>0.1\text{-UI}_{\text{pp}}$  jitter tolerance (JTOL) from 10 to 100 MHz.

**Index Terms**—Clock and data recovery (CDR), CMOS, distributed driver, four-level pulse amplitude (PAM-4), Mach-Zehnder modulator (MZM), optical digital-to-analog converter (DAC), silicon photonic (SiPh), transmitter (TX).

## I. INTRODUCTION

THE big data generated by cloud computing and artificial intelligence leads to an ever-increasing demand for datacenter interconnect bandwidth (BW) [1]. When the data rate exceeds 56 Gb/s, electrical links suffer from power-hungry equalization and short channel reach [2], [3]. To overcome these issues, optical links are widely used to provide low insertion loss, high BW, and improved power efficiency. In particular, some applications differ substantially from telecom, such as the chip-to-chip interconnects in datacenter switches, high-performance FPGAs, and GPUs, which require very small form factors and considerably high volumes. As optical modules evolve from the pluggable to the co-packaged form, silicon photonic (SiPh) transceivers attract increasing attention for their large-scale integration capability and potentially low cost.

In SiPh transceivers, the choice of the modulator and its driving scheme determines the system architecture and its power consumption. In the case of co-packaged optics (CPO), three different SiPh modulators are commonly adopted: the micro-ring modulator (MRM), the electro-absorption modulator (EAM), and the Mach-Zehnder modulator (MZM). Although the MRM has a compact size and an inherent capability of wavelength division multiplexing (WDM), it suffers from an unstable resonance that requires complicated and power-consuming compensation. The EAM is typically implemented in the C-band (1550 nm) rather than the O-band (1310 nm) that is commonly used in datacenters [4]–[12]. In comparison, the MZM provides the most stable optical performance, including high BW and linearity, but is penalized by a low modulation efficiency. To meet the stringent power and integration requirements in CPO, the MZM topology and driving scheme must be improved significantly.

Due to low modulation efficiency, the traveling-wave (TW) MZM needs a long phase shifter to obtain a sufficient optical



Fig. 1. Comparison of SiPh MZMs. (a) TW-MZM with a single driver. (b) LS-MZM with the distributed driver.

extinction ratio (ER). The electrode is typically implemented as a 2–3-mm-long transmission line (T-line) with a $50\ \Omega$ resistive termination [see Fig. 1(a)]. Significant T-line attenuation demands a higher input swing from the driver, e.g., 3–4 $V_{ppd}$ (peak-to-peak differential swing), which is power-hungry and hard to realize in CMOS circuits [13]–[20]. The lumped-segment (LS) MZM is developed to improve the modulation efficiency. By dividing a long phase shifter into multiple short segments and mapping a driver slice onto each segment, the signal attenuation is greatly reduced [see Fig. 1(b)] [29], [30], [34]–[36]. Furthermore, each segment presents a small capacitive load to the driver, which is beneficial for low-power operation.

As the per-channel data rate evolves beyond 50 Gb/s, four-level pulse amplitude modulation (PAM-4) is widely adopted in optical transmitters (TXs). Compared to non-return-to-zero (NRZ) signaling, it doubles the effective bit rate at the cost of a 9-dB lower signal-to-noise ratio (SNR). Therefore, high linearity and high ER are of great importance for PAM-4 modulation. To achieve multi-level modulation with limiting-output drivers, the binary-weighted dual-segment MZM has been designed to build an optical digital-to-analog converter (O-DAC) [20], [23]–[31]. Although the need for a high-swing linear driver is eliminated, nonlinearity problems remain in practical realizations. First, the static linearity relies on the binary-weighted segment length, which is fixed once fabricated and may not provide an exact 2:1 phase shift [25], [27]. In addition, the top and bottom eye heights in the PAM-4 eye diagram cannot be tuned independently [25]. Second, for the multi-segment MZM with TW electrodes, the $50\ \Omega$ termination must still be driven at high power consumption. Third, for the phase shifters in the LS-MZM, the p-n junction capacitance is modulated by the driving voltage, which leads to dynamic nonlinearity, i.e., asymmetric transition transients [25], [32].

Another design issue of the LS-MZM concerns the velocity mismatch when it is integrated with distributed drivers. Specifically, across all segments, the distributed driver signal must be synchronized to the light traveling in the waveguide. Otherwise, both inter-symbol interference (ISI) and PAM-4 eye distortion would be introduced [25], [29], [30]. The TW-MZM utilizes the T-line microwave delay to match the propagation. In the LS-MZM, the delay must be precisely generated by driver subcircuits. Prior works have employed the current-starved inverter delay [29], [30], [36], as well as the clock phase-mixer delay [30]. Yet, their tuning range is limited and sensitive to process-voltage-temperature (PVT) variations. A robust, wide-range phase-control technique is needed.

This article presents a 50-Gb/s PAM-4 SiPh TX incorporating an LS-MZM and a distributed multi-segment driver (DMSD) with an integrated PAM-4 clock and data recovery (CDR) circuit. The MZM segments are split into three groups and coded in the thermometer format, acting as an O-DAC to improve the modulation efficiency and the flexibility of nonlinear predistortion. In-segment multiplexing with a clock phase interpolator is devised to synchronize the electrical and optical signals for optimized PAM-4 signal integrity. A dual-rail voltage-mode driver with a hybrid output network is proposed to improve the modulation BW and output ER. Experimental results show clear eye diagrams with a 9.8-dB ER and a 0.99 PAM-4 ratio of level mismatch (RLM), revealing a 1.39-pJ/bit/dB figure-of-merit (FoM). This article is an extension of [37] with enriched contents, including the LS-MZM design considerations, an analysis of modulation nonlinearities and their impact on the optical PAM-4 eye diagram, design details of the DMSD, and more measurement results.

The remainder of this article is organized as follows. Section II introduces the TX system design considerations. Section III illustrates the circuit realization of the proposed distributed CMOS driver. Section IV describes the synchronization technique between electrical and optical signaling. Experimental results are demonstrated in Section V, followed by the conclusion of this work in Section VI.

## II. SYSTEM DESIGN CONSIDERATIONS

### A. SiPh LS-MZM

In electrical-to-optical (E-O) modulation, a high optical ER is desirable to improve the SNR as well as to save laser power. Fig. 2(a) shows the instantaneous output power ($P_m$) of a generic MZM when modulated by a phase shift ($\Delta\phi$) with respect to the quadrature bias point. The normalized transfer function is a sinusoidal curve

$$P_m = \frac{1}{2} + \frac{1}{2}\cos(\varphi_b + \Delta\phi) \quad (1)$$

where $\varphi_b$ denotes the initial phase shift, which is typically set to $90^\circ$ as the quadrature bias point. The modulated ER is defined as the ratio of the highest optical power ($P_{max}$) to the lowest power ($P_{min}$)

$$\mathrm{ER} = 10 \log_{10} \frac{P_{max}}{P_{min}}. \quad (2)$$

For a fixed laser power, a larger $\Delta\phi$ leads to a higher ER, which could be realized by a higher driving-swing, a longer phase shifter, or a lower reverse bias on its p-n junction. However, these are limited by the CMOS driver output swing, as well as by the large capacitive loading under low bias. Thus, the most promising way is to increase the effective length of the MZM.
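The dependence of ER on the phase swing in (1) and (2) can be illustrated numerically. The following is a minimal sketch, assuming an ideal lossless MZM and a phase swing applied symmetrically about the bias point; the function names and the swept swing values are illustrative and are not taken from the design, so the printed numbers are not expected to reproduce Fig. 2(b).

```python
import math

def mzm_power(delta_phi_deg, bias_deg=90.0):
    """Normalized MZM output power from (1): P = 1/2 + 1/2*cos(phi_b + delta_phi)."""
    return 0.5 + 0.5 * math.cos(math.radians(bias_deg + delta_phi_deg))

def extinction_ratio_db(phase_swing_pp_deg, bias_deg=90.0):
    """ER from (2) for a peak-to-peak phase swing centered on the bias point."""
    half = phase_swing_pp_deg / 2.0
    p_a = mzm_power(-half, bias_deg)
    p_b = mzm_power(+half, bias_deg)
    return 10.0 * math.log10(max(p_a, p_b) / min(p_a, p_b))

if __name__ == "__main__":
    # Hypothetical peak-to-peak phase swings, chosen only for illustration.
    for swing in (30.0, 60.0, 90.0, 120.0):
        print(f"{swing:5.1f} deg -> ER = {extinction_ratio_db(swing):5.2f} dB")
```

Under these assumptions the ER grows monotonically with the phase swing, which is the motivation for lengthening the phase shifter when the driver swing is capped.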



Fig. 2. Design of the SiPh LS-MZM. (a) PAM-4 modulation with respect to thermometer-code phase shifts. (b) Simulated ER versus modulator length with $4\text{-}V_{\text{ppd}}$ and $3\text{-}V_{\text{ppd}}$ driving-swing, respectively. (c) Simulated reverse-biased p-n junction capacitance with respect to $V_{\text{bias}}$ and corresponding measured PAM-4 eye diagram. (d) Illustration of the LS-MZM.

Fig. 2(b) shows the output ER as a function of the modulator length and driving-swing. Since the driving-swing may shrink to $3\text{-}V_{\text{ppd}}$ under the heaviest feed-forward equalization (FFE), the phase shifter should reach 2.7 mm for a 10-dB ER. In this work, the SiPh MZM is designed with a 3-mm effective length to leave some design margin. It features a $41^\circ/\text{mm}$ phase shift when driven by a $4\text{-}V_{\text{ppd}}$ swing at a 2-V reverse bias. The modulation efficiency ($V_\pi L$) is measured as $17.5\ \text{V}\cdot\text{mm}$. To keep the lumped-element approximation valid, the segment length is limited to within 1/10 of the signal wavelength, which is 1.2 mm for a 25-Gbaud data stream. In this design, the 3-mm modulator is separated into six $500\text{-}\mu\text{m}$-long segments. Considering the physical design, if the segments were placed in-line as in the TW-MZM, the driver chip would need a matched-size layout, which consumes large power to distribute high-speed signals. Fig. 2(d) depicts the block diagram of the proposed LS-MZM, whose six segments are placed in a zig-zag layout. High-speed electrode pads are placed at one end of each segment, forming an in-line array located at the edge of the chip. In this way, the driver's footprint only needs to match the total width of the MZM, instead of the total length.
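The quoted figures can be cross-checked with simple arithmetic. The sketch below assumes a phase shift that is linear in the driving voltage, which is only a first-order approximation for a depletion-mode phase shifter; the helper names are hypothetical.

```python
def v_pi_l_v_mm(phase_deg_per_mm, drive_v):
    """V_pi*L implied by a phase efficiency quoted at a given drive voltage.

    Assumes phase shift proportional to voltage (first-order approximation).
    """
    return drive_v * 180.0 / phase_deg_per_mm

def segment_length_mm(total_mm, n_segments):
    """Length of each segment when the modulator is split evenly."""
    return total_mm / n_segments

def is_lumped(segment_mm, limit_mm):
    """True when a segment is short enough to be treated as a lumped load."""
    return segment_mm <= limit_mm

if __name__ == "__main__":
    print(f"V_pi*L ~ {v_pi_l_v_mm(41.0, 4.0):.2f} V*mm")   # quoted: 17.5 V*mm
    seg = segment_length_mm(3.0, 6)
    print(f"segment = {seg:.1f} mm, lumped: {is_lumped(seg, 1.2)}")
```

With the 41 degrees/mm figure at a 4-V swing, the implied value lands close to the quoted 17.5 V·mm, and each 0.5-mm segment sits well inside the 1.2-mm lumped limit.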

To ensure sufficient BW, the phase shifter's p-n junction is reverse-biased in the depletion mode, with a simulated E-O BW higher than 60 GHz. By connecting the P+ doped region to the radio-frequency electrodes with a common-mode voltage ($V_{\text{CM}}$) while giving the N+ region an external dc bias ($V_{\text{EXT}}$), the reverse bias is set to $(V_{\text{EXT}} - V_{\text{CM}})$. However, the junction capacitance varies with the bias voltage ($V_{\text{bias}}$) [30], [32], which contributes to dynamic nonlinearity in the PAM-4 eye. As observed in Fig. 2(c), there is about 30% capacitance variation when applying a $2\text{-}V_{\text{pp}}$ single-ended driving voltage to the MZM with a 2-V dc bias voltage. The



Fig. 3. Comparison of different LS-MZM topologies. (a) Binary-code topology. (b) Proposed thermometer-code topology.

dual-arm push-pull driving scheme is employed to obtain a high ER and BW [19], [33]. The MZM initial phase shift is tuned by an on-chip resistive heater. Grating couplers are used as the optical interfaces between the single-mode fiber and the on-chip waveguide.

### B. Thermometer-Coded O-DAC

Fig. 3(a) shows the binary-coded MZM O-DAC with two segments in a 2:1 length ratio, corresponding to the most significant bit (MSB) and least significant bit (LSB), respectively. Since the physical length is fixed, the phase shift of MSB and LSB



Fig. 4. System block diagram of the proposed SiPh TX.

cannot be tuned flexibly. Furthermore, to obtain a higher ER, the MZM initial bias can be shifted away from $90^\circ$ [26], [33]. A bias offset helps to make the power level “00” even lower, but results in unequal eye heights at the top and bottom. For such distorted PAM-4 eyes, the binary-coded MZM cannot provide a matched calibration.

Fig. 3(b) shows the proposed thermometer-coded MZM O-DAC, consisting of six segments grouped into three bits. The four-level modulation is achieved by accumulating the equal-weighted phase shifts driven by 3-bit NRZ signals, which enables independent calibration of each sub-eye’s height. Since the driving-swing for each segment is adjustable, the thermometer-coded O-DAC offers improved flexibility in linearity calibration. Apart from the illustrated benefits of the LS-MZM, more segments enable precise nonlinearity pre-distortion and reduce the capacitive load of each driver. Furthermore, half of the segments can be disabled to cut power consumption while maintaining a PAM-4 optical output with a reasonable ER. Yet, too many segments result in extra power consumption due to the pre-driver, retimer, and MUX in every segment. In addition, a larger segment count wastes SiPh chip area through increased silicon waveguide routing and wastes driver power through ultra-long clock and data distribution. Moreover, with more segments, a much wider delay adjustment range is required to compensate for the E-O velocity mismatch, which exceeds the coverage of the topology described in Section IV. Thus, the choice of two segments per group is a compromise among performance, power consumption, area, and E-O velocity match.
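The linearity benefit of per-group tuning can be sketched with the transfer function in (1). The model below is an assumption-laden illustration: each group is treated as an ideal NRZ phase modulator whose peak-to-peak phase contribution stands in for its adjustable driving-swing, the bias is fixed at quadrature, and RLM is computed with one common definition (three times the smallest eye over the total range), which may differ from the definition used in the measurements. All names are hypothetical.

```python
import math

# Thermometer code per PAM-4 symbol: one bit per dual-segment group.
THERMOMETER = {0: (0, 0, 0), 1: (1, 0, 0), 2: (1, 1, 0), 3: (1, 1, 1)}

def pam4_levels(group_swing_deg, bias_deg=90.0):
    """Normalized optical power per symbol, using P = 1/2 + 1/2*cos(phi_b + dphi)."""
    levels = []
    for symbol in range(4):
        bits = THERMOMETER[symbol]
        # Each group adds +swing/2 for a 1 and -swing/2 for a 0 (NRZ about the bias).
        dphi = sum((2 * b - 1) * g / 2.0 for b, g in zip(bits, group_swing_deg))
        levels.append(0.5 + 0.5 * math.cos(math.radians(bias_deg + dphi)))
    return levels

def rlm(levels):
    """Level-mismatch ratio as 3 * smallest eye / total range (1.0 is ideal)."""
    s = sorted(levels)
    steps = [b - a for a, b in zip(s, s[1:])]
    return 3.0 * min(steps) / (s[-1] - s[0])

def predistorted_swings(total_swing_deg):
    """Per-group swings that equalize the three eyes at quadrature bias."""
    half = total_swing_deg / 2.0
    mid = math.degrees(math.asin(math.sin(math.radians(half)) / 3.0))
    return (half - mid, 2.0 * mid, half - mid)

if __name__ == "__main__":
    equal = (40.0, 40.0, 40.0)               # hypothetical equal phase steps
    tuned = predistorted_swings(120.0)       # same total swing, unequal steps
    print(f"equal steps: RLM = {rlm(pam4_levels(equal)):.3f}")
    print(f"tuned steps: RLM = {rlm(pam4_levels(tuned)):.3f}")
```

In this idealized model, equal phase steps compress the outer eyes because of the sinusoidal transfer curve, while slightly larger outer-group swings restore equal eye heights. A binary-weighted pair of segments has only two adjustable weights and therefore cannot set the three eyes independently.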

### C. MZM Driver With CDR

As depicted in Fig. 4, the proposed SiPh TX integrates the LS-MZM with the dedicated CMOS driver. Optimized for direct wire-bonding, the driver is co-designed with matched segments and footprints. The layout and channel pitch are carefully coordinated to minimize the bondwire parasitics and electromagnetic (EM) coupling. At a high data rate,



Fig. 5. Block diagram of the PAM-4 CDR.

an integrated CDR is desirable to suppress the accumulated input ISI for a clear-eye transmission [25], [26], [28], [38]. In this work, an integrated PAM-4 CDR retimes the 50-Gb/s data and recovers a 25-GHz differential clock. The CDR also performs the analog-to-digital conversion, enabling the multi-bit O-DAC driving topology. For E-O synchronization, a digital-assisted clock gating technique is proposed for the clock and data distribution.

As depicted in Fig. 5, a mixed-signal PAM-4 CDR is devised with an analog loop, similar to previous works [25], [28], [39], [44]. An analog front-end (AFE) with a continuous-time linear equalizer (CTLE) and a variable gain amplifier (VGA) conditions the input PAM-4 signal. Different from the previous work [25], the half-rate bang-bang phase detector (BBPD) is simplified to a single sampler for transition-edge extraction. The PAM-4 slicers with adjustable thresholds quantize the data into a 3-bit thermometer code, which is then distributed to all driver segments. The bang-bang logic makes the lead/lag decision at quarter-rate. Among all sampled data transitions, it uses only the symmetric transition pairs, namely $+3/-3$ to $-3/+3$ and $+1/-1$ to $-1/+1$. A phase-locked loop is built from the charge pump (CP), 2nd-order loop filter (LPF), voltage-controlled oscillator (VCO), and dividers. The VCO generates a differential clock covering the 25–28-GHz band, which is further divided for the BBPD and the 2:4 demultiplexer (DeMUX). In addition, the phase interpolator (PI) tunes the clock phase to guarantee robust retiming in the 2:4 DeMUX and to optimize the BER with the optimum sampling phase. To enable the BER test, the quarter-rate thermometer-coded data is converted to the binary format and output off-chip.
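The slicing and code conversion described above can be sketched behaviorally. This is a minimal illustration, assuming nominal PAM-4 levels of -3, -1, +1, and +3 with thresholds midway between them and a plain (non-Gray) binary mapping for the off-chip output; the actual thresholds are adjustable and the output mapping is not specified in the text.

```python
def slice_pam4(sample, thresholds=(-2.0, 0.0, 2.0)):
    """Return the 3-bit thermometer code: one comparator decision per threshold."""
    return tuple(int(sample > t) for t in thresholds)

def thermometer_to_binary(code):
    """Convert a thermometer code to (MSB, LSB) by counting the ones."""
    n = sum(code)
    return (n >> 1) & 1, n & 1

if __name__ == "__main__":
    for level in (-3.0, -1.0, 1.0, 3.0):
        code = slice_pam4(level)
        print(level, code, thermometer_to_binary(code))
```

Each thermometer bit maps directly onto one driver group, which is why no further decoding is needed between the CDR and the O-DAC.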

## III. CIRCUIT IMPLEMENTATION

Fig. 6(a) depicts the block diagram of the proposed DMSD. All segments of the distributed driver are divided into three groups, corresponding to the 3-bit thermometer-coded data. The 4-phase clock at 12.5 GHz is distributed along with the half-rate data to each segment. Current-mode logic (CML) and CMOS inverter buffers are utilized, respectively, for the clock and data distribution. The CML is chosen for clock distribution due to its good noise rejection and low transient noise of the power supply, while CMOS is employed for data distribution because of its compatibility with CMOS D-flip-flop (DFF) in



Fig. 6. Implementation of the DMSD. (a) Block diagram. (b) Layout detail.

the retimer, and the low power consumption. In each driver segment, the data is retimed and then multiplexed (MUX) to 25 Gb/s full rate. The sampling clock phase is carefully manipulated with the aid of the digital PI. A push-pull stacked output stage drives the LS-MZM at both the cathode and anode nodes. A hybrid output network is employed for the adjustable output common-mode and BW enhancement. All the above circuits are identical in every driver segment.

### A. In-Segment Retiming and Multiplexing

A pair of half-rate NRZ data streams is distributed to each driver segment. They are first retimed by a DFF to remove the accumulated jitter caused by the data distribution. The sampling clock is generated by a 4-bit current-summation PI with an average resolution of 1.33 ps [25]. The retimed data is then multiplexed into two 25-Gb/s data streams with a 40-ps time interval, which are later used for the 2-tap FFE in the output stage. The 25-Gb/s data travels through the single-to-differential converter and the equal-length buffer chain, and finally arrives at the pseudo-differential driver [see Fig. 6(b)]. The buffer chain consists of cascaded inverters and a metal trace of nearly 250 μm. In the pseudo-differential driver, the 2-V supply is divided into two stacked power domains, i.e., 0–1 V and 1–2 V. The compact latch-based level-shifter is used for the voltage-domain transformation [see Fig. 6(a)] [9], [29], [46]. Apart from a weak inverter on the feedback path, the dc-blocking capacitance and series resistance are 120 fF and 10 kΩ, respectively, allowing for a PRBS-31 pattern at 25 Gb/s. A 30-Ω damping resistor is added between the output stage and test pad to eliminate over-peaking resulting from the parasitic inductance of the wire bond. Finally, segment-enable control is integrated, which selects between NRZ and PAM-4 modulation and switches the driver to the low-power mode by cutting off half of the segments. To match the driver output with the light traveling across the LS-MZM, a controlled delay is required between every two neighboring segments. Besides the fixed distribution delay, the data is retimed in each segment by a dedicated clock phase. The phase difference between two segments defines the relative delay.

### B. Dual-Rail Output Stage

The LS-MZM segment presents a lumped-capacitor load to the driver, which demands a high driving-swing and fast transients. Compared to CML drivers, the voltage-mode driver provides a rail-to-rail voltage swing without static current dissipation. Fig. 7(a) shows the stacked inverter and the push-pull stacked inverter with a single p-n junction as the load [4], [29]. When driving a capacitive load, the push-pull stacked inverter can deliver an effective differential swing of $4 \times VDD$. Compared to the $2 \times VDD$ swing of the stacked inverter [29], [36], the push-pull stacked inverter doubles the output swing, which is preferred for high-ER modulation. However, the doubled voltage swing may overstress the cascode transistor because of the different transients at its drain and source nodes [9], [46].

In this design, the driver output stage adopts the push-pull stacked topology [see Fig. 7(c)]. A data-dependent pulse generator is inserted to generate narrow pulses that prevent overstress. To adjust the output swing, 4-bit output-swing control circuits are implemented [see Fig. 7(c)]. Herein, the 2-tap FFE compensates for the output channel loss, such as that of the bondwire inductor and the modulator electrode. Swing control realizes the optical eye-height adjustment for high-linearity PAM-4 modulation. By adding an NMOS for current discharge on the charging path and a PMOS for current charge on the discharging path, the driver output swing can be changed. The EQ and swing control circuits differ in their high-speed input data, which are 1-UI delayed, like $\overline{IN\_H}$, $\overline{IN\_H_p}$ and $\overline{IN\_L}$, $\overline{IN\_L_p}$. Fig. 7(d) shows the simulated waveforms of the 2-tap FFE and output swing adjustments at 25 Gb/s, which achieve up to 43% over-peaking and 20% swing reduction.
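The effect of a 2-tap FFE on the drive waveform can be sketched with a symbol-rate behavioral model. This is not the circuit of Fig. 7(c), which implements equalization inside the voltage-mode output stage; the linear two-tap filter, the tap weight `alpha`, and the interpretation of over-peaking as the transition peak relative to the settled level are all assumptions made for illustration.

```python
def ffe_2tap(bits, alpha):
    """Behavioral 2-tap FFE: y[n] = x[n] + alpha*(x[n] - x[n-1]), x in {-1, +1}.

    Transitions are emphasized; repeated bits settle to the nominal level.
    """
    symbols = [1.0 if b else -1.0 for b in bits]
    out, prev = [], symbols[0]
    for s in symbols:
        out.append(s + alpha * (s - prev))
        prev = s
    return out

def over_peaking(alpha):
    """Transition peak relative to the settled level, i.e. (1 + 2*alpha) - 1."""
    return 2.0 * alpha

if __name__ == "__main__":
    alpha = 0.215  # hypothetical weight giving 43% over-peaking in this model
    print(ffe_2tap([0, 1, 1, 0, 0, 1], alpha))
    print(f"over-peaking = {over_peaking(alpha):.0%}")
```

In this model a stronger tap raises the transition peak but leaves the settled level unchanged, which is why the swing control is kept as a separate adjustment for eye-height linearity.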

### C. Hybrid Output Network

LS-MZM phase shifters need to be biased in the depletion region for high modulation efficiency and high BW, both of which are sensitive to the bias voltage on the p-n junction. When the driver outputs are dc-coupled to the modulator, the p-n junction shares the same common mode with the driver.



Fig. 7. Output stage: (a) reference topology, (b) employment in the DMSD, (c) single-ended schematic, and (d) simulated FFE and swing adjustment.



Fig. 8. Output network: (a) dc-coupled equivalent circuit, (b) ac-coupled equivalent circuit, (c) reference device value, and (d) simulated gain and BW boosting against the ac-coupling capacitance.

The reverse-bias voltage is then limited and cannot be adjusted flexibly. However, different segments may need different bias voltages to achieve the same performance because of process variation. Capacitive ac-coupling enables a custom output common mode for adjusting the reverse bias of the MRM [9] and LS-MZM. In addition, an ac-coupled output network provides a high-pass frequency response, which can be used to boost the BW.

Herein, the hybrid-coupling network is employed in each driver group, which consists of a dc-coupled segment and an ac-coupled segment [see Fig. 6(a)]. Different from the previous work [47], the two paths are finally combined in the optical domain. The equivalent schematic of the dc- and ac-coupled output networks is shown in Fig. 8(a)–(c). The pad parasitic capacitors of the driver and modulator ($C_{pad1}$ and $C_{pad2}$), the bondwire inductor ($L_{pkg}$), and the damping resistor ($R_d$) are taken into consideration. Fig. 8(d) shows the simulated gain and BW boosting against the ac-coupling capacitor ($C_{AC}$). A smaller capacitance introduces higher BW boosting but a lower driving-swing on the MZM. The choice of $C_{AC}$ is therefore a compromise between BW boosting and modulation efficiency. Fig. 9 compares the simulated output eye diagrams of the dc-coupled, ac-coupled, and hybrid-coupled networks with PRBS-7 data. In the ac-coupled case, there is increased jitter caused by dc wandering, leading to 860-fs peak-to-peak jitter and a 16.3-ps rise time (10%–90%). By combining the dc and ac paths, the jitter is decreased to 293 fs with an improved 18-ps rise time. In summary, the dc wandering is partially solved by the dc-coupled path, while the overall driving-signal BW is boosted by the ac-coupled path.
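The swing side of the $C_{AC}$ trade-off follows from a capacitive divider. The sketch below is a deliberately reduced model: it keeps only the series coupling capacitor and a lumped load capacitance, ignores the bondwire, damping resistor, and bias network, and uses hypothetical capacitance values, since the reference device values of Fig. 8(c) are not reproduced in the text.

```python
def ac_coupled_swing_ratio(c_ac_ff, c_load_ff):
    """Mid-band fraction of the driver swing reaching the load through C_AC.

    Simple capacitive divider: C_AC in series with a lumped load capacitance.
    """
    return c_ac_ff / (c_ac_ff + c_load_ff)

if __name__ == "__main__":
    c_load_ff = 100.0  # hypothetical segment plus pad capacitance, in fF
    for c_ac_ff in (100.0, 200.0, 400.0, 800.0):
        ratio = ac_coupled_swing_ratio(c_ac_ff, c_load_ff)
        print(f"C_AC = {c_ac_ff:5.0f} fF -> swing ratio = {ratio:.2f}")
```

A small coupling capacitor gives up driving-swing, which is the cost that has to be weighed against the BW boosting seen in simulation.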

## IV. VELOCITY MATCH AND CALIBRATION

In segmented MZMs, the E-O velocity mismatch leads to signal integrity degradation, including the eye-height reduction and PAM-4 eye diagram skew [25], [30]. To clarify the impact, a 1st-order simplified equation is used to describe the eye-height reduction versus the E-O phase mismatch. For the data



Fig. 9. Simulated 25-Gb/s driver output eye diagrams with (a) the dc-coupled output network, (b) the ac-coupled network, and (c) the hybrid output network.

pattern “010,” the short-bit “1” can be simplified as

$$V_d = V_0 \sin wt, \quad (0 < t < 1\text{UI}) \quad (3)$$

where $V_d$, $V_0$, and $w$ denote the driver output, the signal amplitude, and the angular Nyquist frequency, respectively. According to (1), near the quadrature bias point the MZM-modulated optical power is approximately proportional to the modulated phase variation ($\Delta\varphi$). Assuming that the E-O conversion BW is much larger than the Nyquist frequency and neglecting high-order effects in the E-O conversion, $\Delta\varphi$ is expressed as

$$\Delta\varphi = kV_0 \sin wt, \quad (0 < t < 1\text{UI}) \quad (4)$$

where $k$ represents the E-O conversion coefficient. Considering the inter-segment electrical delay and optical propagation delay, the optical power modulated by two MZM segments is described as

$$P_{\text{out}} = \frac{1}{2} + \frac{1}{2} \cos(\varphi_b + \Delta\varphi_{\text{seg1}} + \Delta\varphi_{\text{seg2}}), \quad (\varphi_b + \Delta\varphi_{\text{seg1}} + \Delta\varphi_{\text{seg2}} < \pi) \quad (5)$$

$$\Delta\varphi_{\text{seg1}} + \Delta\varphi_{\text{seg2}} = kV_0 (\sin(wt + \varphi_{\text{opt}}) + \sin(wt + \varphi_e)), \quad (0 < t < 1\text{UI}) \quad (6)$$

$$|\Delta\varphi_{\text{seg1}} + \Delta\varphi_{\text{seg2}}| = kV_0 \sqrt{2 + 2 \cos(\varphi_e - \varphi_{\text{opt}})} \quad (7)$$

where $P_{\text{out}}$ denotes the normalized optical power of the two-segment modulator. Here, $\Delta\varphi_{\text{seg1}}$ and $\Delta\varphi_{\text{seg2}}$ denote the modulated phase shifts of the segment-1 and segment-2 modulators, respectively, while $\varphi_e$ and $\varphi_{\text{opt}}$ represent the phase delays of the inter-segment electrical transmission and optical propagation, respectively. Equation (7) gives the amplitude of the summed phase modulation, which reaches its maximum of $2kV_0$ only when $\varphi_e = \varphi_{\text{opt}}$. It indicates that the E-O velocity mismatch reduces the overall phase shift and thus the output eye height.
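The amplitude in (7) can be evaluated for a given delay mismatch. The sketch below is a first-order illustration in the same spirit as (3)–(7): it converts a delay mismatch to a phase difference at the 12.5-GHz Nyquist frequency of a 25-Gbaud stream and normalizes to the matched case. The function names and the swept mismatch values are illustrative.

```python
import math

def combined_amplitude(phi_e_rad, phi_opt_rad, k_v0=1.0):
    """Amplitude of the summed phase modulation from (7)."""
    return k_v0 * math.sqrt(2.0 + 2.0 * math.cos(phi_e_rad - phi_opt_rad))

def mismatch_penalty(delay_mismatch_ps, nyquist_ghz=12.5):
    """Amplitude relative to the matched case, which gives 2*k*V0."""
    dphi = 2.0 * math.pi * nyquist_ghz * 1e9 * delay_mismatch_ps * 1e-12
    return combined_amplitude(dphi, 0.0) / 2.0

if __name__ == "__main__":
    for mismatch_ps in (0.0, 5.0, 11.0, 20.0):  # hypothetical residual mismatches
        print(f"{mismatch_ps:4.1f} ps -> relative amplitude "
              f"{mismatch_penalty(mismatch_ps):.3f}")
```

In this simplified model an uncompensated 11-ps mismatch between two segments costs roughly 9% of the combined phase amplitude, and the loss grows quickly as the mismatch approaches half a unit interval.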

In this work, there is a velocity mismatch between the DMSD and the LS-MZM because the length of the clock and data distribution in the DMSD differs significantly from that of the silicon waveguide in the LS-MZM. As shown in Fig. 10, CML buffers are placed between adjacent segment drivers in the clock distribution to ensure sufficient clock amplitude as well as the expected clock delay. The PI-based timing calibration provides a delay adjustment range of 80 ps, which cannot fully cover the calculated optical propagation delay of approximately 105 ps from segment-1 to segment-6. Since the delay of the traveling wave across the 1.8-mm transmission line (T-line) is $\sim 10$ ps, the



Fig. 10. Timing diagram of the velocity mismatch calibration steps.

CML buffers are inserted between the adjacent segment drivers for delay compensation, each contributing a delay of $\sim 8$ ps. Considering a 10-ps static delay for the clock signal (8 ps of the CML buffer and 2 ps of the traveling wave across the T-line) and a 21-ps optical propagation delay across a segment, the remaining gap of $\sim 11$ ps in the inter-segment E-O delay requires PI-based tunable delay compensation. In practice, the inter-segment static delay of the buffers in the DMSD varies with PVT. Thus, a calibration procedure based on the optical eye diagram is developed to further optimize the E-O velocity match.
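The delay budget above is simple arithmetic and can be written out explicitly. The sketch below uses only the figures quoted in the text; dividing the end-to-end delays evenly over the five inter-segment gaps and converting the residual into PI codes with the 1.33-ps average step are assumptions made for illustration.

```python
def inter_segment_budget(optical_total_ps=105.0, tline_total_ps=10.0,
                         cml_buffer_ps=8.0, n_segments=6, pi_step_ps=1.33):
    """Split the end-to-end delays over the gaps and return the per-gap budget."""
    gaps = n_segments - 1
    optical_ps = optical_total_ps / gaps              # optical delay per gap
    static_ps = cml_buffer_ps + tline_total_ps / gaps  # CML buffer + T-line share
    residual_ps = optical_ps - static_ps              # left for the PI to cover
    return {
        "optical_ps": optical_ps,
        "static_ps": static_ps,
        "residual_ps": residual_ps,
        "pi_codes": round(residual_ps / pi_step_ps),
        "total_residual_ps": residual_ps * gaps,
    }

if __name__ == "__main__":
    for name, value in inter_segment_budget().items():
        print(f"{name:18s} {value}")
```

With the static buffers absorbing most of the optical delay, the accumulated residual over all gaps stays within the 80-ps PI range, whereas the full 105 ps would not.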

The calibration procedure is performed in four steps:

- 1) Find the high-SNR sampling region (HSSR) for the in-segment retimer. In each segment, the half-rate data is first retimed to remove jitter. To optimize the retiming clock phase, it is necessary to locate the HSSR (see Fig. 10, in green), which refers to the 1-UI time window at half-rate excluding the jittery edge. To find the HSSR, all segments but Segment-5 are disabled. By sweeping the PI setting while monitoring the output eye diagram, the HSSR is identified as the low-jitter region. The PI codes of each segment are kept within this range in the following steps to eliminate jitter.
- 2) Calibrate the velocity mismatch in each dual-segment group. In the LS-MZM, each dual-segment group



Fig. 11. Chip photo of the hybrid-integrated SiPh TX.

generates an independent NRZ output. The calibration is performed by enabling only one group of segments. According to the analysis in (7), the optical NRZ eye-opening is maximized when the intra-group velocity match is met. Taking Group-0 as an example, the retiming phase is located in the HSSR and the initial PI settings of both segments are the same. Then, the sampling phase in Segment-2 is tuned while Segment-1 is kept fixed; the calibration is completed when the output eye height reaches its maximum. Group-1 and Group-2 are calibrated in the same way. After this step, the required delay compensations between the segments in each group are obtained, which are used in the last step.

- 3) Calibrate the inter-group velocity mismatch. The skew of the PAM-4 sub-eyes is removed when the inter-group velocity mismatch is suppressed. In this step, only one segment out of each group is enabled, since the intra-group delay compensations are determined in step 2. For example, Segments-1, 3, and 5 are enabled, while Segments-2, 4, and 6 are disabled. First, the same initial PI settings within the HSSR are applied to Segments-1, 3, and 5. Then, the PI settings of Segments-1 and 5 are fixed and that of Segment-3 is swept until the skew between the top and middle eyes in the PAM-4 eye diagram is eliminated. Next, the PI settings of Segments-1 and 3 are fixed and that of Segment-5 is swept until all eyes align. At the end of this step, the inter-group delay compensations are determined.
- 4) Fine-tune for the velocity match. After the previous three calibration steps, the delay compensations between every pair of segments are known. In addition to skew elimination and eye-opening maximization, the FFE and swing settings are tuned to achieve the best PAM-4 linearity, which slightly affects the calibration results. The PI settings are therefore fine-tuned to further optimize the output eye diagram for the smallest skew and the maximum eye opening.
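
The calibration steps above amount to a sequence of one-dimensional sweeps over PI codes, each keeping all other segments fixed. The following is a minimal Python sketch of the intra-group step (step 2), not the authors' test software. The `measure` callback, the PI code range, the function names, and the toy eye model are all hypothetical; on the bench the metric would be read from the sampling oscilloscope.

```python
from typing import Callable, Dict, Iterable

PiCodes = Dict[int, int]  # segment number -> PI code (hypothetical representation)


def sweep_pi(measure: Callable[[PiCodes], float],
             codes: PiCodes,
             segment: int,
             candidates: Iterable[int]) -> int:
    """Sweep one segment's PI code with the others fixed; return the best code."""
    best_code, best_metric = codes[segment], float("-inf")
    for code in candidates:
        trial = dict(codes)
        trial[segment] = code
        metric = measure(trial)
        if metric > best_metric:
            best_code, best_metric = code, metric
    return best_code


def calibrate_group(measure, codes, fixed_seg, tuned_seg, hssr):
    """Step 2: start both segments at the same code, then tune one for max eye height."""
    codes = dict(codes)
    codes[tuned_seg] = codes[fixed_seg]
    codes[tuned_seg] = sweep_pi(measure, codes, tuned_seg, hssr)
    return codes


def toy_eye_height(codes: PiCodes) -> float:
    # Toy stand-in for a scope measurement (assumption): the eye is tallest
    # when the total delays (PI code + hidden path skew) of both segments match.
    hidden_skew = {1: 0, 2: 3}
    t1 = codes[1] + hidden_skew[1]
    t2 = codes[2] + hidden_skew[2]
    return -abs(t1 - t2)


hssr_codes = range(4, 13)   # assumed PI codes lying inside the HSSR
start = {1: 10, 2: 10}      # same initial setting for both segments
result = calibrate_group(toy_eye_height, start, fixed_seg=1, tuned_seg=2, hssr=hssr_codes)
print(result)
```

Steps 3 and 4 could reuse `sweep_pi` with a different metric, for example the negative of the measured skew between sub-eyes, swept over one segment per group.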



Fig. 12. Test setup: (a) principle of the CDR evaluation, (b) principle of the E-O modulation evaluation, and (c) on-site photograph.

## V. MEASUREMENT RESULTS

Fig. 11 shows the hybrid-integrated TX prototype with a CMOS driver and a SiPh modulator. The driver is fabricated in a 40-nm CMOS process with a total area of  $4.9 \text{ mm}^2$ . The modulator is implemented in a 180-nm SiPh process with 0.5-mm length segments. The integrated CDR performance can be evaluated individually as depicted in Fig. 12(a). The test setup of the overall optical TX is detailed in Fig. 12(b) and (c).

In Fig. 12(a), the pulse pattern generator (PPG) outputs 290-mV<sub>ppd</sub> 50-Gb/s PAM-4 PRBS-7 data to the driver chip through 20-inch cables and 0.75-inch PCB traces, with a 3-dB channel loss at the Nyquist frequency. The retimed 6.25-Gb/s data is fed to the error detector (ED) for the BER test. The phase noise of the recovered 6.25-GHz clock is measured by the spectrum analyzer. Both the retimed data and the recovered clock are connected to the sampling oscilloscope for the eye-diagram measurement. The eye diagram of the input single-ended 50-Gb/s PAM-4 signal is displayed in Fig. 13(a). The recovered 6.25-GHz clock exhibits a 745-fs rms jitter and a $-120.1\text{-dBc/Hz}$ phase noise at a 20-MHz frequency offset, as shown in Fig. 13(b) and (d), respectively. The eye diagram of the recovered data is fully open, as shown in Fig. 13(c). The PAM-4 BER is tested sequentially on the recovered MSB data and LSB data. A BER of $<10^{-12}$ is achieved by manually optimizing the PAM-4 thresholds and the sampling clock phase. Fig. 14 plots the measured jitter tolerance (JTOL) [48] curve with a BER of $<10^{-12}$ at 50 Gb/s. The corner frequency is around 20 MHz with a minimum JTOL of 0.17 UI<sub>pp</sub>, which indicates a loop BW consistent with Fig. 13(d).

Fig. 13. Measured (a) 50-Gb/s PAM-4 input, (b) recovered 6.25-GHz clock, (c) recovered 6.25-Gb/s data, (d) phase noise of the recovered clock.

Fig. 14. Measured jitter tolerance curve with BER of  $<10^{-12}$  at 50 Gb/s.
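
For orientation, the rates quoted in the CDR measurement can be related with simple arithmetic. The short Python sketch below derives the symbol rate and UI from the 50-Gb/s PAM-4 data rate and expresses the measured 745-fs rms clock jitter in UI. Normalizing the jitter to the 25-GBaud UI is a choice made here for illustration, not a figure reported in the measurements.

```python
bit_rate = 50e9                 # b/s, PAM-4 data rate from the text
bits_per_symbol = 2             # PAM-4 carries 2 bits per symbol
baud = bit_rate / bits_per_symbol
ui = 1.0 / baud                 # unit interval in seconds

recovered_clock = 6.25e9        # Hz, recovered clock frequency from the text
clock_division = baud / recovered_clock

rms_jitter = 745e-15            # s, measured rms jitter of the recovered clock
rms_jitter_ui = rms_jitter / ui # illustrative normalization (assumption)

print(f"symbol rate  = {baud / 1e9:.2f} GBaud")
print(f"UI           = {ui * 1e12:.1f} ps")
print(f"baud / clock = {clock_division:.1f}")
print(f"rms jitter   = {rms_jitter_ui:.4f} UI")
```

Under this normalization the rms jitter is a small fraction of the symbol period, well below the 0.17-UI<sub>pp</sub> minimum of the measured JTOL curve, although the two quantities are not directly comparable (rms versus peak-to-peak).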

As illustrated in Fig. 12(b), the 1550-nm light is produced by a continuous-wave (CW) laser source, amplified by an erbium-doped fiber amplifier (EDFA) with a 19-dBm optical output power, and then polarization-tuned by a polarization controller. The light is grating-coupled into the silicon waveguide and modulated by a high-speed phase shifter. The modulated light is then amplified by an EDFA to compensate for the power loss on the optical path, and filtered by a programmable optical filter for noise reduction before being measured by the optical sampling oscilloscope. The arbitrary waveform generator (AWG) outputs the differential 290-mV<sub>pp</sub> 50-Gb/s PAM-4 PRBS-7 data to the driver chip through the same input channel as in the CDR test [see Fig. 13(a)]. The p-n junction bias voltage and the phase bias of the modulator chip are adjusted with the onboard voltage sources. The default p-n junction reverse-bias voltage is 2 V, with a 3-V off-chip bias on the N-doped side and a 1-V bias on the P-doped side determined by the driver output common mode. Fig. 12(c) shows a photo of the optical TX test setup. The fiber coupling is realized on a precision fiber-coupling platform.

The optical NRZ modulation is demonstrated with one dual-segment group enabled at a time and the other groups disabled. The input data of the three dual-segment groups are thermometer-coded, denoted as D2, D1, and D0. Fig. 15(a)–(c) shows the measured 25-Gb/s NRZ eye diagrams with D2, D1, and D0 individually enabled. Due to the thermometer encoding, the three eye diagrams exhibit different level-0/1 probabilities. As shown in Fig. 15(a), the eye diagram with D2 enabled shows a much higher probability of level-0, which is around 75% for the PRBS-7 pattern. The opposite result is observed in the eye diagram with D0 enabled [see Fig. 15(c)]. With the optimized FFE settings and E-O velocity match, the eye diagrams [see Fig. 15(a)–(c)] achieve a total jitter (TJ) of $<7.0$ ps and an ER of $>5.1$ dB. Fig. 15(d) shows an eye diagram with D1 enabled and FFE disabled. A comparison between Fig. 15(b) and (d) validates the effective channel-loss compensation of the driver. By default, the modulator operates with a 2-V p-n junction reverse bias and the phase bias at quadrature for sufficient BW and a symmetric eye diagram [see Fig. 15(a)–(d), (g), and (h)]. To achieve a higher ER, the p-n junction bias voltage is reduced and the phase bias is shifted away from the quadrature point [see Fig. 15(e) and (f)]. Fig. 15(e) and (f) show the eye diagrams with a single ac-coupled segment and dual segments enabled, respectively. The resulting 4.1-dB ER with a single ac-coupled segment enabled indicates the potential for low-power optical modulation with our TX. Fig. 15(g) and (h) show the measured optical NRZ eye diagrams with only the dc-coupled segment enabled and only the ac-coupled segment enabled, respectively. With the same FFE coefficients, the ac-coupled segment exhibits a wider BW than the dc-coupled segment, with significantly reduced rise/fall times, as shown in Fig. 15(g) and (h), which verifies the effective BW boosting of the ac-coupled output network. Compared to Fig. 15(e), the eye height in Fig. 15(h) is much smaller since the modulator works in the default mode.
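
The unequal level-0/1 statistics of the three thermometer-coded streams follow directly from the code mapping. The Python sketch below enumerates the four PAM-4 symbols and counts how often each thermometer bit is 0. It assumes equiprobable symbols (approximately true for a PRBS pattern) and assumes that D2 corresponds to the highest threshold and D0 to the lowest; both are labeling assumptions made for illustration.

```python
from fractions import Fraction
from typing import Tuple


def thermometer(symbol: int) -> Tuple[int, int, int]:
    """Map a PAM-4 symbol (0..3) to thermometer bits (D2, D1, D0)."""
    if not 0 <= symbol <= 3:
        raise ValueError("PAM-4 symbol must be in 0..3")
    # Assumed mapping: D0 = 1 for symbol >= 1, D1 for >= 2, D2 for >= 3.
    return tuple(int(symbol >= k) for k in (3, 2, 1))


symbols = range(4)  # assumed equiprobable
names = ("D2", "D1", "D0")
p_zero = {
    name: Fraction(sum(1 - thermometer(s)[i] for s in symbols), len(symbols))
    for i, name in enumerate(names)
}

for name in names:
    print(f"P({name} = 0) = {float(p_zero[name]):.2f}")
```

With this mapping D2 is at level-0 for three of the four symbols and D0 for only one, consistent with the roughly 75% level-0 occupancy observed with D2 enabled and the reversed statistics with D0 enabled.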

Compared to the previously published work in [25], we realize a significantly improved modulation efficiency. With the same 1-mm modulator length and 4-V<sub>ppd</sub> drive swing, a 6-dB ER is realized in this work, albeit with an adjusted phase bias, as shown in Fig. 15(f), which is much higher than the 3.8-dB ER of Fig. 16(b) in [25]. First, the TW-MZM in [25] suffers from much higher signal attenuation due to its long electrode. In addition, the termination resistance of the modulator in [25] is about 60 Ω differential, which improves the E-O modulation BW but degrades the drive swing and hence the modulation efficiency. Without a termination, there is no drive-swing degradation on the LS-MZM. Therefore, the LS-MZM is advantageous for high modulation efficiency.



Fig. 15. Measured 25-Gb/s optical PRBS-7 NRZ eye diagrams: (a) code D2 enabled only with optimized FFE, (b) code D1 enabled only with optimized FFE, (c) code D0 enabled only with optimized FFE, (d) code D1 enabled only without FFE, (e) single segment (ac-coupled) enabled in Group 1 (see Fig. 4) at high-ER mode, (f) dual segments enabled in Group 1 at high-ER mode, (g) dc-coupled segment enabled only, and (h) ac-coupled segment enabled only.



Fig. 16. Measured 50-Gb/s optical output: (a) without velocity mismatch calibration, (b) without nonlinearity pre-distortion, and (c) with velocity mismatch calibration and nonlinearity pre-distortion.

Fig. 16 shows the measured 50-Gb/s optical PAM-4 eye diagrams when all segments are enabled. Without velocity mismatch calibration, severe skew exists in the PAM-4 eye [see Fig. 16(a)], which is greatly reduced when the calibration is enabled [see Fig. 16(c)]. Regarding the nonlinear distortion [see Fig. 16(b)], the sub-eye heights are calibrated by the driver swing pre-distortion. As a result, the PAM-4 eye RLM is improved from 0.79 to 0.99, while the output ER reaches up to 9.8 dB [see Fig. 16(c)]. Further validation of the O-DAC is carried out by disabling only Group-0, which leads to the PAM-3 output shown in Fig. 17(a). The proposed optical TX consumes 912 mW in total in the high-performance PAM-4 mode, of which 682 mW is consumed by the DMSD and 230 mW is dissipated by the PAM-4 CDR. The power breakdown is listed in Table I. The low-power PAM-4 mode is verified with only one segment enabled in each group. In this mode, approximately 40% of the power is saved at the cost of a lower ER. As depicted in Fig. 17(b), a clear optical PAM-4 eye diagram is obtained with a reasonable 4.1-dB output ER.
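
To make the two eye metrics concrete, the Python sketch below computes an ER and an RLM from four PAM-4 optical levels. It assumes ER = 10·log10(P_high/P_low) and one common RLM definition, three times the smallest level spacing divided by the outer level separation; the exact definition used for the reported 0.79 and 0.99 values may differ. The example levels are hypothetical and chosen only to show the effect of compressed outer eyes; they are not measured data.

```python
import math
from typing import Sequence


def extinction_ratio_db(p_high: float, p_low: float) -> float:
    """ER in dB from the highest and lowest optical power levels."""
    return 10.0 * math.log10(p_high / p_low)


def rlm(levels: Sequence[float]) -> float:
    """Level-mismatch ratio: 3 * (minimum level spacing) / (outer separation).

    Assumed definition; equals 1.0 for four equally spaced levels.
    """
    v = sorted(levels)
    if len(v) != 4 or v[3] == v[0]:
        raise ValueError("need four PAM-4 levels with a nonzero outer separation")
    steps = [b - a for a, b in zip(v, v[1:])]
    return 3.0 * min(steps) / (v[3] - v[0])


ideal = [1.0, 4.0, 7.0, 10.0]        # hypothetical, equally spaced levels
compressed = [1.0, 4.6, 7.6, 10.0]   # hypothetical, top eye compressed

print(f"ER(ideal)       = {extinction_ratio_db(ideal[3], ideal[0]):.1f} dB")
print(f"RLM(ideal)      = {rlm(ideal):.2f}")
print(f"RLM(compressed) = {rlm(compressed):.2f}")
```

In this toy example, pre-distorting the driver swings so that the three spacings become equal would bring the RLM from about 0.8 back to 1.0 without changing the outer levels, and hence without changing the ER.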

TABLE I  
POWER BREAKDOWN

| Sub-Block          | Power (mW) |
|--------------------|------------|
| CDR                | 230        |
| Clock Distribution | 75.5       |
| PIs                | 64.5       |
| Data Distribution  | 6.1        |
| Retimer and MUX    | 35.9       |
| Driver             | 500        |
| Total              | 912        |

Considering the ER is proportional to drive-swing and the number of enabled segments, it is reasonable to add the ER into the power efficiency evaluation. The FoM is defined as

$$\text{FoM} = \frac{\text{Power}}{\text{Datarate} \cdot \text{ER}}. \quad (8)$$
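
As a worked check of (8), the Python sketch below recomputes the totals from the Table I power breakdown and evaluates the FoM for the high-performance mode. The dictionary and function names are illustrative. The DMSD power is taken here as the total minus the CDR, which matches the 682-mW figure quoted in the text.

```python
# Table I power breakdown in mW
power_mw = {
    "CDR": 230.0,
    "Clock Distribution": 75.5,
    "PIs": 64.5,
    "Data Distribution": 6.1,
    "Retimer and MUX": 35.9,
    "Driver": 500.0,
}


def fom_pj_per_bit_per_db(p_mw: float, rate_gbps: float, er_db: float) -> float:
    """FoM of (8): power / (data rate * ER); mW / (Gb/s) equals pJ/bit."""
    return p_mw / rate_gbps / er_db


total_mw = sum(power_mw.values())
dmsd_mw = total_mw - power_mw["CDR"]   # everything except the CDR
rate_gbps = 50.0
er_db = 9.8

efficiency = dmsd_mw / rate_gbps       # pJ/bit, excluding the CDR
fom = fom_pj_per_bit_per_db(dmsd_mw, rate_gbps, er_db)

print(f"total power      = {total_mw:.1f} mW")
print(f"DMSD power       = {dmsd_mw:.1f} mW")
print(f"power efficiency = {efficiency:.2f} pJ/bit")
print(f"FoM              = {fom:.2f} pJ/bit/dB")
```

The same function can be applied to the other columns of Table II, for example 613 mW at 50 Gb/s and 5.6-dB ER for [20].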

TABLE II  
PERFORMANCE SUMMARY AND COMPARISON WITH RECENT ARTS

|                              | [20]<br>OFC'2016 | [25]<br>JSSC'2020 | [29]<br>BCICTS'2018 | [33]<br>JSSC'2016 | [35]<br>OFC'2017 | [36]<br>ISSCC'2015 | This work        |
|------------------------------|------------------|-------------------|---------------------|-------------------|------------------|--------------------|------------------|
| Technology                   | 65-nm CMOS       | 40-nm CMOS        | 16-nm FinFET        | 55-nm BiCMOS      | 130-nm BiCMOS    | 65-nm CMOS         | 40-nm CMOS       |
| Integration                  | Wire bond        | Wire bond         | Copper Pillar       | Copper Pillar     | Wire bond        | Copper Pillar      | Wire bond        |
| Modulator Structure          | Segmented,<br>TW | Segmented,<br>TW  | Segmented,<br>LS    | TW                | Segmented,<br>LS | Segmented,<br>LS   | Segmented,<br>LS |
| MZM Length (mm)              | 3                | 3                 | 7                   | 5.34              | 1.8              | 3                  | 3                |
| Modulation Format            | NRZ/<br>PAM-4    | NRZ/<br>PAM-4     | NRZ/<br>PAM-4       | NRZ               | NRZ              | NRZ                | NRZ/<br>PAM-4    |
| Data Rate (Gb/s)             | 50               | 50                | 56                  | 56                | 56               | 25                 | 50               |
| ER (dB)                      | 5.6              | N/A               | 9.5                 | 2.5               | <1.8             | 6.0                | 9.8              |
| Power (mW)                   | 613              | 1340*             | 708                 | 300               | 2300             | 275                | 682/912**        |
| Power Efficiency<br>(pJ/bit) | 12.3             | 26.8              | 12.6                | 5.4               | 41.1             | 11                 | 13.64            |
| FoM (pJ/bit/dB)              | 2.19             | N/A               | 1.33                | 2.14              | >22.8            | 1.83               | 1.39             |

\* and \*\*: including the power consumption of the CDR.



Fig. 17. Measured optical PRBS-7 PAM-3/4 eye diagrams with partially enabled segments. (a) PAM-3 with Group 1 and Group 2 segments enabled only (see Fig. 4). (b) PAM-4 with single segment in each group enabled.

With a 9.8-dB ER at a 50-Gb/s data rate, the proposed TX achieves a 1.39-pJ/bit/dB FoM excluding the CDR. Table II benchmarks the TX against recently published MZM-based optical TXs with hybrid-integrated drivers. Compared to [20], [25], [33], [35], and [36], this work achieves the highest ER and the best FoM, validating the high efficiency of the TX. When reconfigured to the low-power mode, a reasonable ER of 4.1 dB is obtained with three segments enabled, which is suitable for short-reach optical links.

## VI. CONCLUSION

This article has demonstrated a 50-Gb/s PAM-4 SiPh TX with a hybrid-integrated CMOS driver and a SiPh LS-MZM. A high-swing voltage-mode driver is introduced to work with the LS-MZM and effectively improve the power efficiency. The optical-domain PAM-4 scheme and thermometer-coded topology in the DMSD enable high-linearity optical PAM-4 modulation, with PI-based velocity calibration for skew elimination. The PAM-4 CDR is inserted to bridge the PAM-4 input data and the DMSD with jitter reduction. An ER of 9.8 dB and an RLM of 0.99 have been achieved at 50 Gb/s with a 1.39-pJ/bit/dB FoM. The low-power mode of the TX is also demonstrated with 40% power savings and an ER over 4 dB. A BER of  $<10^{-12}$  and a jitter tolerance of  $>0.1 \text{ UI}_{\text{pp}}$  from 10 to 100 MHz are realized for the PAM-4 CDR with a 230-mW power consumption.

## ACKNOWLEDGMENT

The xModel software is provided by Scientific Analog, Inc., and the PeakView Software is provided by Lorentz, Inc.

## REFERENCES

- [1] *Global Cloud Index: Forecast and Methodology 2016–2021*, Cisco, San Jose, CA, USA, 2018.
- [2] J. Kim *et al.*, “A 224 Gb/s DAC-based PAM-4 transmitter with 8-tap FFE in 10 nm CMOS,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2021, pp. 126–128.
- [3] M. Choi *et al.*, “An output-bandwidth-optimized 200 Gb/s PAM-4 100 Gb/s NRZ transmitter with 5-tap FFE in 28 nm CMOS,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2021, pp. 128–130.
- [4] H. Li *et al.*, “A 3-D-integrated silicon photonic microring-based 112-Gb/s PAM-4 transmitter with nonlinear equalization and thermal control,” *IEEE J. Solid-State Circuits*, vol. 56, no. 1, pp. 19–29, Jan. 2021.
- [5] Y. Zhang *et al.*, “200 Gbit/s optical PAM4 modulation based on silicon microring modulator,” in *Proc. Eur. Conf. Opt. Commun. (ECOC)*, Dec. 2020, pp. 1–4.
- [6] S. Fathololoumi *et al.*, “1.6 Tbps silicon photonics integrated circuit for co-packaged optical-IO switch applications,” in *Proc. Opt. Fiber Commun. Conf. (OFC)*, 2020, pp. 1–3.
- [7] S. Moazeni *et al.*, “A 40-Gb/s PAM-4 transmitter based on a ring-resonator optical DAC in 45-nm SOI CMOS,” *IEEE J. Solid-State Circuits*, vol. 52, no. 12, pp. 3503–3516, Dec. 2017.
- [8] S. Agarwal, M. Ingels, M. Pantouvaki, M. Steyaert, P. Absil, and J. V. Campenhout, “Wavelength locking of a Si ring modulator using an integrated drop-port OMA monitoring circuit,” *IEEE J. Solid-State Circuits*, vol. 51, no. 10, pp. 2328–2344, Oct. 2016.
- [9] H. Li *et al.*, “A 25 Gb/s, 4.4 V-swing, AC-coupled ring modulator-based WDM transmitter with wavelength stabilization in 65 nm CMOS,” *IEEE J. Solid-State Circuits*, vol. 50, no. 12, pp. 3145–3159, Dec. 2015.

- [10] M. Raj *et al.*, "Design of a 50-Gb/s hybrid integrated Si-photonic optical link in 16-nm FinFET," *IEEE J. Solid-State Circuits*, vol. 55, no. 4, pp. 1086–1095, Apr. 2020.
- [11] J. Verbist *et al.*, "Real-time 100 Gb/s NRZ and EDB transmission with a GeSi electroabsorption modulator for short-reach optical interconnects," *J. Lightw. Technol.*, vol. 36, no. 1, pp. 90–96, Jan. 1, 2018.
- [12] S. A. Srinivasan *et al.*, "56 Gb/s germanium waveguide electro-absorption modulator," *J. Lightw. Technol.*, vol. 34, no. 2, pp. 419–424, Jan. 15, 2016.
- [13] A. H. Ahmed, A. E. Moznine, D. Lim, Y. Ma, A. Rylyakov, and S. Shekhar, "A dual-polarization silicon-photonic coherent transmitter supporting 552 Gb/s/wavelength," *IEEE J. Solid-State Circuits*, vol. 55, no. 9, pp. 2597–2608, Sep. 2020.
- [14] O. El-Aassar and G. M. Rebeiz, "A DC-to-108-GHz CMOS SOI distributed power amplifier and modulator driver leveraging multi-drive complementary stacked cells," *IEEE J. Solid-State Circuits*, vol. 54, no. 12, pp. 3437–3451, Dec. 2019.
- [15] L. Szilagyi, R. Henker, D. Harame, and F. Ellinger, "2.2-pJ/bit 30-Gbit/s Mach-Zehnder modulator driver in 22-nm-FDSOI," in *IEEE MTT-S Int. Microw. Symp. Dig.*, Jun. 2018, pp. 1530–1533.
- [16] A. Zandieh, P. Schvan, and S. P. Voinigescu, "Linear large-swing push-pull SiGe BiCMOS drivers for silicon photonics modulators," *IEEE Trans. Microw. Theory Techn.*, vol. 65, no. 12, pp. 5355–5366, Dec. 2017.
- [17] P. Rito *et al.*, "A DC-90-GHz 4-V<sub>pp</sub> modulator driver in a 0.13-μm SiGeC BiCMOS process," *IEEE Trans. Microw. Theory Techn.*, vol. 65, no. 12, pp. 5192–5202, Dec. 2017.
- [18] L. Vera and J. R. Long, "A 40-Gb/s SiGe-BiCMOS MZM driver with 6-V<sub>pp</sub> output and on-chip digital calibration," *IEEE J. Solid-State Circuits*, vol. 52, no. 2, pp. 460–471, Feb. 2017.
- [19] N. Qi *et al.*, "Co-design and demonstration of a 25-Gb/s silicon-photonic Mach-Zehnder modulator with a CMOS-based high-swing driver," *IEEE J. Sel. Topics Quantum Electron.*, vol. 22, no. 6, pp. 131–140, Nov. 2016.
- [20] N. Qi *et al.*, "A 32 Gb/s NRZ, 25 Gbaud/s PAM4 reconfigurable, Si-photonic MZM transmitter in CMOS," in *Proc. Opt. Fiber Commun. Conf. (OFC)*, 2016, pp. 1–3, Paper Th1F.3.
- [21] T. Jyo, M. Nagatani, J. Ozaki, M. Ishikawa, and H. Nosaka, "A 48 GHz BW 225 mW/ch linear driver IC with stacked current-reuse architecture in 65 nm CMOS for beyond-400 Gb/s coherent optical transmitters," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2020, pp. 212–214.
- [22] S. Nakano *et al.*, "A 2.25-mW/Gb/s 80-Gb/s-PAM4 linear driver with a single supply using stacked current-mode architecture in 65-nm CMOS," in *Proc. Symp. VLSI Circuits*, Jun. 2017, pp. C322–C323.
- [23] Y. Sobu, G. Huang, S. Tanaka, Y. Tanaka, Y. Akiyama, and T. Hoshida, "High-speed optical digital-to-analog converter operation of compact two-segment all-silicon Mach-Zehnder modulator," *J. Lightw. Technol.*, vol. 39, no. 4, pp. 1148–1154, Feb. 15, 2021.
- [24] L. Breyné *et al.*, "50 GBd PAM4 transmitter with a 55 nm SiGe BiCMOS driver and silicon photonic segmented MZM," *Opt. Exp.*, vol. 28, no. 16, pp. 23950–23960, 2020.
- [25] Q. Liao *et al.*, "A 50-Gb/s PAM4 Si-photonic transmitter with digital-assisted distributed driver and integrated CDR in 40-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 55, no. 5, pp. 1282–1296, May 2020.
- [26] E. Sentieri *et al.*, "A 4-channel 200 Gb/s PAM-4 BiCMOS transceiver with silicon photonics front-ends for gigabit Ethernet applications," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2020, pp. 210–212.
- [27] A. Samani *et al.*, "Silicon photonic Mach-Zehnder modulator architectures for on chip PAM-4 signal generation," *J. Lightw. Technol.*, vol. 37, no. 13, pp. 2989–2999, Jul. 1, 2019.
- [28] Q. Liao *et al.*, "A dual-28 Gb/s digital-assisted distributed driver with CDR for optical-DAC PAM4 modulation in 40 nm CMOS," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Jun. 2019, pp. 1–4.
- [29] C. Li *et al.*, "A 3D-integrated 56 Gb/s NRZ/PAM4 reconfigurable segmented Mach-Zehnder modulator-based Si-photonics transmitter," in *Proc. IEEE BiCMOS Compound Semiconductor Integr. Circuits Technol. Symp. (BCICTS)*, Oct. 2018, pp. 32–35.
- [30] H. Sepehrian, A. Yekani, L. A. Rusch, and W. Shi, "CMOS-photonics codesign of an integrated DAC-less PAM-4 silicon photonic transmitter," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 63, no. 12, pp. 2158–2168, Dec. 2016.
- [31] C. Xiong *et al.*, "Monolithic 56 Gb/s silicon photonic pulse-amplitude modulation transmitter," *Optica*, vol. 3, no. 10, pp. 1060–1065, 2016.
- [32] S. Yu and T. Chu, "Electrical nonlinearity in silicon modulators based on reversed PN junctions," *Photon. Res.*, vol. 5, no. 2, 2017, Art. no. 2000124.
- [33] E. Temporiti *et al.*, "Insights into silicon photonics Mach-Zehnder-based optical transmitter architectures," *IEEE J. Solid-State Circuits*, vol. 51, no. 12, pp. 3178–3191, Dec. 2016.
- [34] T. N. Huynh *et al.*, "Flexible transmitter employing silicon-segmented Mach-Zehnder modulator with 32-nm CMOS distributed driver," *J. Lightw. Technol.*, vol. 34, no. 22, pp. 5129–5136, Nov. 15, 2016.
- [35] B. G. Lee *et al.*, "Driver-integrated 56-Gb/s segmented electrode silicon Mach-Zehnder modulator using optical-domain equalization," in *Proc. Opt. Fiber Commun. Conf. (OFC)*, 2017, pp. 1–3.
- [36] M. Cignoli *et al.*, "A 1310 nm 3D-integrated silicon photonics Mach-Zehnder-based transmitter with 275 mW multistage CMOS driver achieving 6 dB extinction ratio at 25 Gb/s," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2015, pp. 1–3.
- [37] Q. Liao *et al.*, "A 50 Gb/s high-efficiency Si-photonic transmitter with lump-segmented MZM and integrated PAM4 CDR," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, Apr. 2021, pp. 1–2.
- [38] S. Hu *et al.*, "A 50 Gb/s PAM-4 retimer-CDR + VCSEL driver with asymmetric pulsed pre-emphasis integrated into a single CMOS die," in *Proc. Opt. Fiber Commun. Conf. (OFC)*, 2019, pp. 1–3.
- [39] L. Chang *et al.*, "A 50 Gb/s-PAM4 CDR with on-chip eye opening monitor for reference-level and clock-sampling adaptation," in *Proc. Opt. Fiber Commun. Conf. (OFC)*, 2018, pp. 1–3.
- [40] N. Qi *et al.*, "A 51 Gb/s, 320 mW, PAM4 CDR with baud-rate sampling for high-speed optical interconnects," in *Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC)*, Nov. 2017, pp. 89–92.
- [41] A. Balachandran, Y. Chen, and C. C. Boon, "A 32-Gb/s 3.53-mW/Gb/s adaptive receiver AFE employing a hybrid CTLE, edge-DFE and merged data-DFE/CDR in 65-nm CMOS," in *Proc. IEEE Asia Pacific Conf. Circuits Syst. (APCCAS)*, Nov. 2019, pp. 221–224.
- [42] X. Zhao, Y. Chen, P.-I. Mak, and R. P. Martins, "A 0.14-to-0.29-pJ/bit 14-GBaud/s trimodal (NRZ/PAM-4/PAM-8) half-rate bang-bang clock and data recovery circuit (BBCDR) in 28-nm CMOS," in *Proc. IEEE Asia Pacific Conf. Circuits Syst. (APCCAS)*, Bangkok, Thailand, Nov. 2019, pp. 229–232.
- [43] X. Zhao, Y. Chen, P.-I. Mak, and R. P. Martins, "A 0.14-to-0.29-pJ/bit 14-GBaud/s trimodal (NRZ/PAM-4/PAM-8) half-rate bang-bang clock and data recovery circuit (BBCDR) in 28-nm CMOS," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 68, no. 1, pp. 89–102, Jan. 2021.
- [44] X. Zhao, Y. Chen, P.-I. Mak, and R. P. Martins, "A 0.0285 mm<sup>2</sup> 0.68 pJ/bit single-loop full-rate bang-bang CDR without reference and separate frequency detector achieving an 8.2(Gb/s)/μs acquisition speed of PAM-4 data in 28 nm CMOS," in *Proc. CICC*, Mar. 2020, pp. 1–4.
- [45] A. Amirkhani, "Basics of clock and data recovery circuits: Exploring high-speed serial links," *IEEE Solid State Circuits Mag.*, vol. 12, no. 1, pp. 25–38, Winter 2020.
- [46] C. Li *et al.*, "Silicon photonic transceiver circuits with microring resonator bias-based wavelength stabilization in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 49, no. 6, pp. 1419–1436, Jun. 2014.
- [47] V. Kozlov and A. C. Carusone, "Capacitively-coupled CMOS VCSEL driver circuits," *IEEE J. Solid-State Circuits*, vol. 51, no. 9, pp. 2077–2090, Sep. 2016.
- [48] X. Ge, Y. Chen, X. Zhao, P.-I. Mak, and R. P. Martins, "Analysis and verification of jitter in bang-bang clock and data recovery circuit with a second-order loop filter," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 27, no. 10, pp. 2223–2236, Oct. 2019.



**Qiwen Liao** received the B.S. degree in electrical engineering from Xi'an Jiaotong University, Xi'an, China, in 2016, and the Ph.D. degree in microelectronics from the University of Chinese Academy of Sciences, Beijing, China, in 2021.

He is currently a Senior Engineer with Changsha Hengding Information Technologies, Changsha, China.

His research interests include high-speed mixed-signal IC design for silicon photonics and electrical/optical transceiver.



**Yuguang Zhang** received the B.S. degree from the Huazhong University of Science and Technology, Wuhan, China, in 2012, and the Ph.D. degree from Zhejiang University, Hangzhou, China, in 2017.

He is currently a Silicon Device Engineer with the National Optoelectronics Innovation Center (NOEIC), Wuhan. He has authored more than 30 publications, including two ECOC post-deadline papers (PDPs). His present research interests include the high-speed electrooptic modulators, silicon-based microwave photonics, and micro-cavities.



**Siyuan Ma** received the B.S. degree in electronic science and engineering from Nanjing University, Nanjing, Jiangsu, China, in 2020. He is currently pursuing the M.S. degree in materials science and opto-electronic technology from the University of Chinese Academy of Sciences, Beijing, China.

His current research interests include high-speed driver IC design for optical modulations and silicon photonics.



**Lei Wang** received the Ph.D. degree in optoelectronic information engineering from the Huazhong University of Science and Technology, Wuhan, China, in 2012.

He is the Manager of the Silicon Photonics Research and Development Department, National Optoelectronics Innovation Center (NOEIC), where he leads a team working on research, test, and product development. He has developed 100G/400G silicon photonics products used in 5G, telecom, and datacom. Prior to joining NOEIC, he was the Director of the Silicon Photonics Group of the State Key Laboratory of Optical Communication Technologies and Networks, CICT.

**Leliang Li** received the B.S. degree in electronic science and technology from Northeast Petroleum University, Daqing, China, in 2007, and the Ph.D. degree in microelectronics and solid-state electronics from the Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China, in 2014. He was a Lecturer with Gannan Normal University, Ganzhou, China, from 2015 to 2020. He currently holds a post-doctoral position with the Institute of Semiconductors, Chinese Academy of Sciences, Beijing, where he focused on high-speed circuits co-designed with silicon photonics. His research interest includes the design of EPICs.



**Guike Li** was born in Shandong, China, in 1980. He received the B.S. degree in microelectronics from Lanzhou University, Lanzhou, China, in 2005, and the Dr. degree in microelectronics and solid-state electronics from the Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China, in 2011.

Since 2011, he has been an Assistant Professor in microelectronics and solid-state electronics with the State Key Laboratory of Superlattices and Microstructures, Institute of Semiconductors, Chinese Academy of Sciences. His current research interests include CMOS image sensors and integrated silicon photonics.



**Zhao Zhang** (Member, IEEE) received the B.S. degree from the Beijing University of Posts and Telecommunications, Beijing, China, in 2011, and the Ph.D. degree from the Institute of Semiconductors, Chinese Academy of Sciences, Beijing, in 2016.

From 2016 to 2018, he was a Post-Doctoral Fellow with The Hong Kong University of Science and Technology, Hong Kong, working on the design of ultralow-jitter PLLs and PAM-4 CDRs. From 2019 to 2020, he was an Assistant Professor with Hiroshima University, Higashi-Hiroshima, Japan, where he focused on the low-jitter mm-wave PLL for THz communication. In 2020, he joined the Institute of Semiconductors, Chinese Academy of Sciences, where he is currently a Professor. His research interests include the design of low-jitter and low-power PLLs, RF/mm-wave frequency synthesizers, wireline transceivers, and ultralow power analog and mixed-signal ICs for energy-harvesting applications.



**Jian Liu** was born in Jilin, China, in 1966. He received the B.S. degree in semiconductor physics and devices from Jilin University, Changchun, China, in 1988, the M.Phil. degree in physics from the University of Birmingham, Birmingham, U.K., in 1998, and the Dr. rer. nat. degree in physics from Julius Maximilian Universität Würzburg, Würzburg, Germany, in 2003.

Since 2005, he has been a Full Professor in microelectronics and solid-state electronics with the State Key Laboratory of Superlattices and Microstructures, Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China. His current research interests include semiconductor optoelectronic detectors, terahertz imagers, and metamaterials.



**Nanjian Wu** (Member, IEEE) was born in Zhejiang, China, in February 1961. He received the B.S. degree in physics from Heilongjiang University, Harbin, China, in 1982, the M.S. degree in electronic engineering from Jilin University, Changchun, China, in 1985, and the D.S. degree in electronic engineering from The University of Electro-Communications, Tokyo, Japan, in 1992.

In 1992, he joined the Research Center for Interface Quantum Electronics and Faculty of Engineering, Hokkaido University, Sapporo, Japan, as a Research Associate. In 1998, he was an Associate Professor with the Department of Electro-Communications, The University of Electro-Communications. Since 2000, he has been a Professor with the Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China. In 2005, as a Visiting Professor, he visited the Research Center for Integrated Quantum Electronics, Hokkaido University. Since 2009, he has been an Honorable Guest Professor with the Research Institute of Electronics, Shizuoka University, Shizuoka, Japan. His research is in the field of mixed-signal LSI and vision chip design.



**Liyuan Liu** (Member, IEEE) received the B.S. and Ph.D. degrees in electrical engineering from the Department of Electronic Engineering and the Institute of Microelectronics, Tsinghua University, Beijing, China, in 2005 and 2010, respectively.

In 2012, he joined the Institute of Semiconductors, Chinese Academy of Sciences, Beijing, China, as an Associate Professor, where he became a Professor in 2018. His current research interests include mixed-signal IC design, CMOS image sensor design, terahertz image sensor design, and monolithic vision chip design.

Dr. Liu has served as a TPC Member of IEEE Asian Solid State Circuits (A-SSCC) from 2013 to 2018. He also served as the TPC Co-Chair for IEEE International Conference on Integrated Circuits Technologies and Applications (ICTA) from 2018 to 2019 and the TPC Chair in 2020.



**Yong Chen** (Senior Member, IEEE) received the B.Eng. degree in electronic and information engineering from the Communication University of China (CUC), Beijing, China, in 2005, and the Ph.D. (Eng.) degree in microelectronics and solid-state electronics from the Institute of Microelectronics, Chinese Academy of Sciences (IMECAS), Beijing, China, in 2010.

From 2010 to 2013, he worked as a Post-Doctoral Researcher with the Institute of Microelectronics, Tsinghua University, Beijing. From 2013 to 2016, he was a Research Fellow at VIRTUS/EEE, Nanyang Technological University, Singapore. He has been an Assistant Professor with the State Key Laboratory of Analog and Mixed-Signal VLSI (AMSV), University of Macau, Taipa, Macau, since March 2016. His research interests include integrated circuit designs involving analog/mixed-signal/RF/mm-wave/sub-THz/wireline.

Dr. Chen was a recipient of the “Haixi” (three places across the Straits) postgraduate integrated circuit design competition (Second Prize) in 2009, a co-recipient of the Best Paper Award at the IEEE Asia Pacific Conference on Circuits and Systems (APCCAS) in 2019, the Best Student Paper Award (Third Place) at the IEEE Radio Frequency Integrated Circuits (RFIC) Symposium in 2021, and the Macao Science and Technology Invention Award (First Prize) in 2020. His team reported 3-chip inventions at the IEEE International Solid-State Circuits Conference-ISSCC (Chip Olympics): mm-wave PLL (2019) and VCO (2019), and radio frequency VCO (2021). He serves as the Vice-Chair (2019–2021) and the Chair (2021–2023) of IEEE Macau CAS Chapter; the Tutorial Chair of ICCS (2020); a Conference Local Organization Committee of A-SSCC (2019); a member of IEEE Circuits and Systems Society, Circuits and Systems for Communications (CASCOM) Technical Committee (2020–2022) and the Technical Program Committee (TPC) of A-SSCC (2021), APCCAS (2019–2021), ICTA (2020–2021), NorCAS (2020–2021), ICECS (2021), and ICSICT (2020); a Review Committee Member of ISCAS (2021–2022); and the TPC Co-Chair of ICCS (2021). He has been serving as an Associate Editor for IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS since 2019 and *Electronics Letters* (IET) since 2020, an Editor for *International Journal of Circuit Theory and Applications* (IJCTA) since 2020, a Guest Editor for IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS in 2021, and an Associate Editor for IEEE ACCESS (2019–2021).



**Xi Xiao** received the B.S. and M.S. degrees from the Huazhong University of Science and Technology, Wuhan, China, in 2005 and 2007, respectively, and the Ph.D. degree from the Institute of Semiconductors, Chinese Academy of Sciences (ISCAS), Beijing, China, in 2010.

He is currently the Director of the Silicon Photonics Lab, FiberHome Technologies, Wuhan; the Vice Director of the State Key Laboratory of Optical Communication Technologies and Networks of China; and the CEO of the National Information Optoelectronics Innovation Center, China. He worked as an Assistant Professor and an Associate Professor with the Institute of Semiconductors, Chinese Academy of Sciences, from 2010 to 2013. His current research interests include high-speed silicon-based PICs and EPICs for optical communication and optical interconnects, as well as their enabling fabrication and integration technologies.



**Nan Qi** (Member, IEEE) received the B.S. degree from the Beijing Institute of Technology, Beijing, China, in 2005, and the M.S. and Ph.D. degrees in microelectronics from Tsinghua University, Beijing, in 2008 and 2013, respectively.

From 2013 to 2015, he was a Research Scholar with the Department of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, USA, where he developed high-speed CMOS drivers for Si-photonics modulators. From 2015 to 2017, he was a Visiting Scholar and then a Circuit-Design Engineer at Hewlett-Packard Labs, Palo Alto, CA, USA, where he focused on high-speed circuits co-designed with silicon photonics. In 2017, he joined the Institute of Semiconductors, Chinese Academy of Sciences, Beijing, where he is currently a Full Professor on electrical circuits and systems. His research interests include the design of integrated circuits for high-speed wireline/optical and wireless transceivers.

Dr. Qi has served as an Associate Editor for *IET Circuits, Devices & Systems* (CDS) since 2019 and as a member of the Technical Program Committee of IEEE A-SSCC (2019) and ICTA (2018–2021).