

# Co-Designed Silicon Photonics Chip I/O for Energy-Efficient Petascale Connectivity

Yuyang Wang<sup>✉</sup>, Member, IEEE, Songli Wang, Robert Parsons, Swarnava Sanyal<sup>✉</sup>, Vignesh Gopal, Asher Novick, Member, IEEE, Anthony Rizzo<sup>✉</sup>, Member, IEEE, Michal Lipson, Fellow, IEEE, Alexander L. Gaeta, Fellow, IEEE, and Keren Bergman<sup>✉</sup>, Fellow, IEEE

**Abstract**—Data volume in hyperscale computing systems has surged exponentially over the past decade, notably driven by artificial intelligence (AI)/machine learning (ML) applications and the emergence of large-scale generative AI models. An urgent need arises for ultrahigh-bandwidth and energy-efficient communications among compute clusters to support the application demands. Embedded silicon photonics (SiPh) promises to enable petascale system-wide connectivity by integrating optical input/output (I/O) directly into the compute socket. SiPh microresonator-based modulators and filters, known for their excellent wavelength selectivity and compact footprints, offer an elegant solution for realizing dense wavelength-division multiplexing (DWDM) links with ultrahigh bandwidth density, leveraging the latest advances in optical frequency comb (OFC) sources and 3-D integration with electronics. In this work, we present our scalable DWDM link architecture, designed with co-packaging in mind. We report device-level measurements of key components and validate comb-driven end-to-end data transmission. These results demonstrate promise in realizing co-packaged optical I/Os with shoreline and areal bandwidth densities beyond 4 Tbps/mm and 17 Tbps/mm<sup>2</sup>, respectively, while consuming sub-pJ/b energy, paving the way for petascale photonic connectivity for energy-efficient computing.

**Index Terms**—Energy efficiency, optical interconnections, photonic integrated circuits (PIC), silicon photonics (SiPh), wavelength division multiplexing.

## I. INTRODUCTION

HYPERSCALE computing infrastructures, such as data centers and high-performance computing systems, are hitting a major connectivity bottleneck across clusters of computing units (CUs) [1]. This bottleneck is prominently driven by the exponential growth of data traffic demanded by AI and machine learning (ML) applications [2]. With the emergence of generative AI, large models with over 100 trillion parameters will soon necessitate training across millions of cores [3]. Tremendous effort has been made to improve the capabilities of these compute clusters, notably through the integration of specialized accelerators (e.g., GPUs [4] and TPUs [5]) and high-bandwidth memory (HBM) [6] modules with high-speed local interconnects [7], [8], [9]. However, such progress has not been matched by the link technologies that connect these accelerators and HBMs across clusters [10], [11]. As a result, a bandwidth discrepancy by two orders of magnitude has emerged across the system stack in state-of-the-art accelerator clusters [12], fundamentally limiting their performance and scalability toward exascale [13].

Optical interconnects leveraging low-loss fibers have been recognized as a promising solution to provide uniformly high-bandwidth communications across a wide range of link distances [14]. To eliminate the current system-wide bandwidth discrepancy, link technologies capable of providing multi-Tbps bandwidth are in imminent demand for the next decade [15]. The main obstacle to their practical deployment, however, lies at chip edges where electrical-optical (EO) and optical-electrical (OE) conversions consume excessive power and area [16]. Specifically, existing solutions based on pluggable optics still require electrical signals to travel over centimeters of copper wires between the CUs and the EO/OE interfaces [17], as illustrated in Fig. 1(a). Such a form factor is not scalable to the projected bandwidth capacity and density requirements without incurring prohibitive energy consumption. The next-generation optical I/O technologies must therefore embrace a closer integration with, and eventually be embedded within, the compute sockets [Fig. 1(b)] to maximize the bandwidth density–energy efficiency product across the system [18].

Received 16 August 2024; accepted 23 October 2024. Date of publication 6 November 2024; date of current version 22 August 2025. This work was supported in part by U.S. Defense Advanced Research Projects Agency (DARPA) under Common Heterogeneous Integration and IP Reuse Strategies (CHIPS) Program under Contract HR00111830002, in part by DARPA under Photonics in the Package for Extreme Scalability (PIPES) Program under Contract HR00111920014, and in part by the Center for Ubiquitous Connectivity (CUBiC), through the Semiconductor Research Corporation (SRC) and DARPA under the JUMP 2.0 Program. Recommended for publication by Associate Editor H. Thacker upon evaluation of reviewers' comments. (*Corresponding author: Keren Bergman*)

Yuyang Wang, Songli Wang, Robert Parsons, Vignesh Gopal, and Keren Bergman are with the Department of Electrical Engineering, Columbia University in the City of New York, New York, NY 10027 USA (e-mail: yw3831@columbia.edu; sw3400@columbia.edu; rp3020@columbia.edu; vvg2113@columbia.edu; bergman@ee.columbia.edu).

Swarnava Sanyal is with the Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY 10027 USA (e-mail: ss6140@columbia.edu).

Asher Novick was with the Department of Electrical Engineering, Columbia University in the City of New York, New York, NY 10027 USA. He is now with Xscape Photonics Inc., Fort Lee, NJ 07024 USA (e-mail: asher@xscapephotronics.com).

Anthony Rizzo was with the Department of Electrical Engineering, Columbia University in the City of New York, New York, NY 10027 USA. He is now with the Thayer School of Engineering, Dartmouth College, Hanover, NH 03755 USA (e-mail: anthony.j.rizzo@dartmouth.edu).

Michal Lipson and Alexander L. Gaeta are with the Department of Electrical Engineering and the Department of Applied Physics and Applied Mathematics, Columbia University in the City of New York, New York, NY 10027 USA (e-mail: ml3745@columbia.edu; a.gaeta@columbia.edu).

Digital Object Identifier 10.1109/TCMT.2024.3492189

Fig. 1. Scalable co-packaged optical I/O. (a) Existing pluggable optics require long copper traces between CUs and electrical-optical (EO)/optical-electrical (OE) interfaces, leading to prohibitive energy consumption and limited bandwidth density. (b) Co-packaged optics (CPO) leveraging 3-D integration and advanced packaging enable tight integration of optical I/O within compute sockets, maximizing the bandwidth density–energy efficiency product. (c) Enhancing the figure of merit for optical I/O through massive wavelength parallelism at moderate data rates per channel. Data from [19], [20], [21], [22], [23], [24], [25], [26], [27]. (d) Scalable link architecture enabled by even-odd (de-)interleavers and cascaded microresonator modulators/filters [20], [28]. (e) Measured optical spectrum of the comb source for link co-design. (f) Valid multi-FSR channel arrangement allowing up to 17 channels per bus achievable by microresonators with a 25.69 nm FSR. Dashed dips indicate resonance aliases, with asterisks (primes) denoting those to the blue (red) side of the nominal resonance.

Embedded SiPh provides a promising pathway to address the above challenge. This promise primarily lies in its scalability and cost-effectiveness, achieved by supporting DWDM through a CMOS-compatible fabrication process [29], [30], [31]. Link architectures leveraging SiPh microresonator-based modulators and filters [32], [33], [34] are of particular interest for packing numerous parallel wavelengths into a compact form factor [35], [36]. As recently shown [37], [38], utilizing more parallel channels at a moderate data rate per channel is particularly advantageous for achieving both high aggregate bandwidth and high energy efficiency. Fig. 1(c) further illustrates the benefits of such design principles for optical I/Os. This approach, combined with recent advances in OFC sources [39], [40], [41], [42], enables an elegant implementation of massive DWDM without relying on bulky laser arrays or dedicated (de-)multiplexers [43], [44], [45], [46], [47].

Among various OFC technologies, silicon nitride ($\text{Si}_3\text{N}_4$) microresonator-based Kerr frequency combs have attracted significant interest due to their compact footprint, CMOS compatibility, and capability of generating hundreds of evenly spaced, low-noise frequency channels from a single continuous-wave laser source [48], [49]. Kerr combs operating in the normal group velocity dispersion (GVD) regime have also demonstrated better conversion efficiency, power per line, and spectral flatness than those of alternative technologies [50], [51], [52], [53]. Leveraging this ultra-broad optical spectrum for massive wavelength parallelism, however, sparks link design challenges across device, architecture, and packaging levels. In this work, we present a scalable link architecture and novel enabling devices for realizing ultrahigh-bandwidth and energy-efficient Kerr comb–driven DWDM data links. We demonstrate hardware validations of key components such as broadband (de-)interleavers, custom vertical-junction (VJ) microdisk modulators, and resonant add-drop filters, and validate end-to-end data transmission driven by the Kerr comb. We also discuss packaging practices and considerations for 3-D integration of electronic drivers and evaluate pathways toward achieving optical I/O bandwidth densities beyond 4 Tbps/mm and 17 Tbps/mm<sup>2</sup> with sub-pJ/b energy consumption. These results demonstrate promise in realizing petascale photonic connectivity for future energy-efficient computing.

## II. SCALABLE LINK ARCHITECTURE

To leverage the massive wavelength parallelism provided by the Kerr comb source, a fundamental limitation of the traditional *single-bus* architecture must be addressed. Specifically, this conventional architecture, which cascades multiple resonators along a single bus, can only accommodate a limited optical bandwidth, upper bounded by the free spectral range (FSR) of the resonators, which is the optical frequency spacing between two successive resonances of the same resonator. When the comb spectrum spans multiple resonator FSRs, the periodic resonances mean that a resonator might capture multiple comb lines simultaneously, leading to significant crosstalk penalties. It is thus essential to co-design the link architecture with the comb source such that the nontarget resonances, referred to as *resonance aliases*, do not interfere with any comb lines within the optical band of interest. In this section, we present our scalable *multibus* link architecture as illustrated in Fig. 1(d) and previously featured in [20] and [28]. The transceiver is co-designed with a Si<sub>3</sub>N<sub>4</sub> dual-ring normal GVD Kerr comb, which has a measured optical spectrum as shown in Fig. 1(e). The main ring dictates the repetition rate of the generated comb and results in an FSR of 100 GHz. At the transmitter (Tx) side, the incoming comb lines are subdivided onto four buses by two stages of de-interleavers, the FSRs of which are designed to be 200 and 400 GHz, respectively, before traversing separate banks of cascaded microresonator modulators. Each stage of de-interleaver splits the incoming wavelengths into “even” and “odd” groups, effectively doubling the channel spacing while maintaining almost the full optical bandwidth at each output port. Each resonant modulator can modulate a distinct wavelength channel while appearing near transparent to other channels on the bus.
The modulated signals from the four buses are recombined by two stages of even-odd interleavers into a single fiber output. Symmetrically, at the receiver (Rx) side, the modulated signals are de-interleaved and sent to respective banks of cascaded microresonator filters that drop each channel onto a photodetector (PD) for sensing.
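The two-stage even-odd splitting described above can be illustrated with a minimal Python sketch. It assumes ideal de-interleavers and labels comb lines by an integer index on the 100 GHz grid; the function names, the 68-line comb, and the resulting bus ordering are illustrative assumptions, not details taken from the paper.

```python
COMB_SPACING_GHZ = 100  # comb line spacing stated in the text


def deinterleave(lines):
    """Split an ordered list of comb lines into its 'even' and 'odd' groups."""
    return lines[0::2], lines[1::2]


def split_onto_four_buses(lines):
    """Apply two ideal de-interleaver stages (200 GHz FSR, then 400 GHz FSR)."""
    even, odd = deinterleave(lines)          # stage 1: spacing becomes 200 GHz
    buses = []
    for group in (even, odd):
        buses.extend(deinterleave(group))    # stage 2: spacing becomes 400 GHz
    return buses


def spacing_ghz(bus):
    """Return the set of line spacings (GHz) found on one bus."""
    return {(b - a) * COMB_SPACING_GHZ for a, b in zip(bus, bus[1:])}


comb = list(range(68))                       # hypothetical: 4 buses x 17 lines
buses = split_onto_four_buses(comb)
for i, bus in enumerate(buses):
    print(f"bus {i}: first lines {bus[:4]}, spacing {spacing_ghz(bus)} GHz")
```

Each stage doubles the spacing, so every bus ends up with lines 400 GHz apart while still spanning nearly the full comb bandwidth.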

As mentioned, the resonator’s FSR needs to be co-designed with the comb channels to avoid overlap between resonance aliases and nontarget comb lines. This design methodology has been mathematically formulated as the *multi-FSR channel arrangement* problem [19], where a valid channel arrangement scheme can be elegantly derived by picking a pair of co-prime integers $(\mathcal{S}, \mathcal{F})$ satisfying

$$\mathcal{S} = \frac{\Delta_{ch}}{\Delta_{agg}} \quad (1a)$$

$$\mathcal{F} = \frac{\text{FSR}}{\Delta_{agg}} \quad (1b)$$

where $\Delta_{ch}$ is the effective channel spacing on each bus after de-interleaving, $\Delta_{agg}$ is the reduced spacing between a comb channel and its nearest resonance alias aggressor, and FSR is the resonator free spectral range being designed. For the link architecture in Fig. 1(d), which features two stages of (de-)interleavers and four buses, we choose $(\mathcal{S}, \mathcal{F}) = (2, 17)$, leading to a resonator FSR of 25.69 nm. This choice allows up to 17 channels per bus without any resonance aliases overlapping with nontarget comb lines, as shown in Fig. 1(f). This design thus effectively performs 64-channel DWDM, with one spare channel per bus accounting for potential comb channel imperfections, targeting a 1.024 Tbps/fiber bandwidth capacity at a moderate data rate of 16 Gbps per channel, or 2.048 Tbps/fiber at 32 Gbps per channel.
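As a rough check of this arrangement, the Python sketch below evaluates (1a) and (1b) for the chosen pair and tests for alias collisions on one bus. It is a simplified model under stated assumptions: the resonator FSR is treated as constant in frequency, the de-interleavers are ideal so only same-bus comb lines matter, and all function names are hypothetical.

```python
from math import gcd

COMB_SPACING_GHZ = 100      # comb line spacing from the text
NUM_BUSES = 4               # two (de-)interleaver stages
S, F = 2, 17                # co-prime pair chosen in the text


def design(s, f, num_buses, comb_spacing_ghz):
    """Return (channel spacing, aggressor spacing, resonator FSR) in GHz."""
    if gcd(s, f) != 1:
        raise ValueError("S and F must be co-prime")
    delta_ch = num_buses * comb_spacing_ghz   # spacing on each bus
    delta_agg = delta_ch / s                  # from (1a)
    fsr = f * delta_agg                       # from (1b)
    return delta_ch, delta_agg, fsr


def has_alias_collision(s, f, num_channels, max_order=10):
    """Check whether any resonance alias lands on a nontarget channel.

    Positions are in units of delta_agg: channel k sits at s*k, and the
    aliases of the resonator serving channel n sit at s*n + m*f, m != 0.
    """
    channels = {s * k for k in range(num_channels)}
    for n in range(num_channels):
        for m in range(-max_order, max_order + 1):
            if m != 0 and (s * n + m * f) in channels:
                return True
    return False


delta_ch, delta_agg, fsr = design(S, F, NUM_BUSES, COMB_SPACING_GHZ)
print(f"channel spacing {delta_ch} GHz, aggressor spacing {delta_agg} GHz, FSR {fsr} GHz")
print("17 channels collide:", has_alias_collision(S, F, 17))
print("18 channels collide:", has_alias_collision(S, F, 18))

data_channels = NUM_BUSES * (F - 1)           # one spare channel per bus
for rate_gbps in (16, 32):
    print(f"{data_channels} channels x {rate_gbps} Gbps = {data_channels * rate_gbps / 1000} Tbps")
```

In this model the resonator FSR comes out as 3.4 THz in frequency, which under (1b) is the counterpart of the 25.69 nm value quoted above. The first-order aliases fall midway between same-bus channels, and the second-order aliases only coincide with a channel once an 18th channel is added.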

The closed-form solution to the multi-FSR channel arrangement problem enables exploration of link architecture variations to balance performance trade-offs, as detailed in [54]. For example, with one stage of (de-)interleaving and two buses, the same bandwidth capacity is attainable with $(\mathcal{S}, \mathcal{F}) = (2, 33)$, requiring minimal adjustments to the resonator design for an FSR of 24.93 nm. Alternatively, novel resonator designs targeting FSRs greater than 50 nm are also attractive for eliminating the crosstalk concerns of resonance aliases, although such resonators have traditionally been challenging to design because their physical dimensions become prohibitively small at higher FSRs [55]. In the next section, we present innovations in device design to enable the proposed scalable link architecture.

## III. ENABLING DEVICES

The transceiver design involves tight integration of multiple constituent components, notably even-odd (de-)interleavers, and microresonator-based modulators and filters. In this section, we present component-level design and validation of the key building blocks for enabling the proposed link architecture.

### A. Compact Even-Odd (De-)Interleavers

As mentioned in Section II and illustrated in Fig. 1(d), even-odd (de-)interleavers are necessary to expand the effective channel spacing on each bus to accommodate the resonance aliases present in the multi-FSR channel arrangement of the cascaded resonator arrays. While basic Mach–Zehnder interferometer (MZI)-based interleavers are compact and relatively straightforward to design, they are prone to fabrication and environmental perturbations and have a limited channel capacity due to the GVD of silicon-on-insulator waveguides. In our design, we adopt a modified MZI design, known as the ring-assisted MZI (RAMZI), for the required even-odd (de-)interleaving operation. RAMZIs incorporate ring resonators to achieve flat-top pass-bands [56], [57], making them more resilient to both perturbations and the FSR mismatch with respect to the comb source, while having a more compact footprint than alternative structures such as cascaded MZIs [58], [59] that achieve a similar flat-top response.

Fig. 2. (De-)interleaver design and characterization [60]. (a) Microscope image of a 400 GHz RAMZI-based interleaver. (b) Schematic of the interleaver. (c) Measured transmission spectra of the interleaver. (d) Schematic of the RAMZI interleaver with an auxiliary monitoring structure for automated tuning. (e) Interleaver spectra with DWDM source before and after autotuning, showing optimized extinction ratio greater than 20 dB and the precise alignment of the pass-bands to the DWDM channels.

**1) Device Design and Characterization:** The RAMZIs are designed with an FSR that is twice the channel spacing of the incoming wavelength channels to achieve even-odd (de-)interleaving. In this study, the first-stage and second-stage (de-)interleavers [in Fig. 1(d)] are designed for 200 and 400 GHz FSRs, respectively. To achieve a broadband flat-top response, a compact multimode interferometer-based coupler with a 15:85 splitting ratio is implemented for effective coupling into the ring. The effective path length of the ring needs to be approximately twice the MZI arm length difference. As an example, Fig. 2(a) shows a microscope image of a fabricated 400 GHz RAMZI, and Fig. 2(b) shows its schematic design. The measured transmission spectra, as shown in Fig. 2(c), exhibit a flat-top response after applying 1 V to the phase shifter on the MZI arm, with an extinction ratio greater than 20 dB over a 50 nm bandwidth for both output ports. The unevenness in the spectrum envelope is primarily due to the grating coupler in the test structure, not the interleaver itself.
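To give a feel for the geometry these FSR targets imply, the sketch below applies the standard MZI relation FSR = c / (n_g ΔL) and the "ring path length is about twice the arm length difference" rule from the text. The group index of 4.2 is an assumed, typical value for a silicon strip waveguide and is not taken from the paper, so the resulting lengths are indicative only.

```python
C_M_PER_S = 299_792_458.0   # speed of light in vacuum
ASSUMED_GROUP_INDEX = 4.2   # assumption: typical silicon strip waveguide, not from the paper


def interleaver_fsr_ghz(channel_spacing_ghz):
    """FSR needed for even-odd (de-)interleaving: twice the input channel spacing."""
    return 2 * channel_spacing_ghz


def arm_length_difference_um(fsr_ghz, group_index=ASSUMED_GROUP_INDEX):
    """MZI arm length difference from FSR = c / (n_g * dL)."""
    return C_M_PER_S / (group_index * fsr_ghz * 1e9) * 1e6


def ring_path_length_um(fsr_ghz, group_index=ASSUMED_GROUP_INDEX):
    """Ring round-trip length, taken as twice the arm length difference per the text."""
    return 2 * arm_length_difference_um(fsr_ghz, group_index)


# Stage 1 receives the 100 GHz comb; stage 2 receives 200 GHz-spaced lines.
for stage, spacing in enumerate((100, 200), start=1):
    fsr = interleaver_fsr_ghz(spacing)
    print(f"stage {stage}: FSR {fsr} GHz, "
          f"dL ~ {arm_length_difference_um(fsr):.1f} um, "
          f"ring ~ {ring_path_length_um(fsr):.1f} um")
```

Under this assumed group index, the 400 GHz stage needs an arm length difference on the order of 180 µm, and the 200 GHz stage needs roughly twice that.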

**2) Automated Tuning:** The (de-)interleavers in the link design are also equipped with a monitoring PD for implementing automated calibration of phase errors, which is vital to achieving the desired flat-top response, as well as automated alignment of the pass-/stop-bands with the DWDM channels. To achieve this, we introduce an auxiliary monitoring structure composed of an MZI with an identical FSR to that of the RAMZI, followed by the PD, as illustrated in Fig. 2(d). The resulting photocurrent, as a function of the applied thermal tuning voltages, reaches its maximum when the pass-bands of both the RAMZI and the monitor MZI align with the DWDM channels. We experimentally verified the automated tuning of the interleavers. The setup consists of eight distributed feedback lasers with a 200 GHz channel spacing, acting as the DWDM source. The DWDM channels are combined with a broadband light source, which is used to visualize the interleaver spectrum shape with an optical spectrum analyzer. The power of the broadband source is much lower than that of the DWDM source, so it has negligible impact on the autotuning process. Fig. 2(e) shows the interleaver transmission spectra before and after the automated tuning process, demonstrating an optimized extinction ratio greater than 20 dB and the precise alignment of the pass-bands to the DWDM channels after tuning. More details on the interleaver device design, tuning algorithm, and evaluations have been reported in [60].
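The idea of maximizing the monitor photocurrent over the heater voltages can be sketched as a simple coordinate-wise hill climb. This is only an illustration: the actual tuning algorithm is the one reported in [60], and the photocurrent model, its peak location, the voltage range, and the function names below are all hypothetical stand-ins for a real measurement.

```python
import math


def toy_photocurrent(v_ramzi, v_monitor):
    """Hypothetical monitor response that peaks at one (unknown) heater setting."""
    return (math.cos(0.5 * (v_ramzi - 1.3)) ** 2) * (math.cos(0.5 * (v_monitor - 0.7)) ** 2)


def autotune(read_photocurrent, v_init=(0.0, 0.0), step=0.2, min_step=1e-3, v_max=3.0):
    """Coordinate-wise hill climb that maximizes the monitor photocurrent."""
    v = list(v_init)
    best = read_photocurrent(*v)
    while step > min_step:
        improved = False
        for i in range(len(v)):
            for delta in (step, -step):
                trial = list(v)
                # Keep each heater voltage inside the allowed range.
                trial[i] = min(max(trial[i] + delta, 0.0), v_max)
                current = read_photocurrent(*trial)
                if current > best:
                    v, best, improved = trial, current, True
        if not improved:
            step /= 2   # refine the search once no neighbor improves
    return tuple(v), best


voltages, photocurrent = autotune(toy_photocurrent)
print(f"tuned voltages: {voltages[0]:.3f} V, {voltages[1]:.3f} V")
print(f"normalized photocurrent: {photocurrent:.4f}")
```

In a real setup, `read_photocurrent` would set the two thermal tuners and read the monitoring PD; the search itself needs no knowledge of the interleaver spectrum, which is the appeal of a photocurrent-maximizing scheme.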

### B. Custom Microdisk Modulators and Filters

Resonant modulators and filters offer a highly efficient and compact solution for wavelength-selective data encoding and decoding. Cascading multiple microresonators along a single bus waveguide forms an array in which each resonator independently interacts with the co-propagating carrier wavelengths with negligible crosstalk between channels. Among resonant modulator designs compatible with standard CMOS foundry fabrication processes, vertical-junction (VJ) *microdisk* modulators have emerged as particularly promising. In contrast to lateral- or interleaved-junction *microring* modulators, VJ microdisk modulators exhibit a unique characteristic where a highly confined whispering gallery mode with a high internal quality factor ($Q$) overlaps with a vertically oriented pn diode. This distinctive combination provides unparalleled modulation efficiency due to an improved depletion response.

**1) Interleaved-Contact VJ Microdisk Modulators:** As mentioned in Section II, the ultra-broad comb spectrum prefers a larger resonator FSR, achievable by decreasing the physical size of the resonator. However, as the modulator radius decreases, the series resistance, a critical factor in determining the modulator RC bandwidth, rises, because only a limited number of parallel radio frequency (RF) contacts can be placed while adhering to foundry design rules for minimum metal spacing. This limit is exacerbated with an integrated heater, which requires room for two additional contacts within an already constrained area [55]. To address this challenge, we have developed new contact placement schemes that distribute the RF contacts more evenly across the junctions to reduce the series resistance and improve the modulation bandwidth and efficiency. Compared to the traditional *left-right* scheme where the RF contacts to the P and N junctions are placed onto two distinct halves of the microdisk [19], [61], [62], our novel *interleaved* contact scheme, as illustrated in Fig. 3(a) and detailed in [63], alternates the RF contacts between P and N, analogous to the arrangement of the interleaved lateral-junction ring modulators [64]. S11 measurement of the device from standalone test structures [Fig. 3(b)] verified the improved RC bandwidth with an increased number of parallel contacts [Fig. 3(c)]. An open eye diagram was obtained at 32 Gbps with PRBS15 signals, driven at a Vpp of only 800 mV [Fig. 3(d)]. This novel implementation of VJ microdisk modulators thus provides a viable pathway for achieving high-bandwidth and energy-efficient DWDM data links.

Fig. 3. Interleaved-contact VJ microdisk modulator [63]. (a) Illustration of the interleaved contact scheme. (b) Microscope image of a fabricated modulator. (c) Measured S11 verifying improved RC bandwidth with an increased number of parallel contacts. (d) Eye diagram at 32 Gbps driven with 0.8 V Vpp.

**2) Modulators With External Half-Etched Heaters:** As an alternative approach mentioned in Section II, which aims at achieving even greater FSRs, we have also designed a novel disk modulator that places a half-height, doped-silicon heater externally around a 2 $\mu\text{m}$ VJ disk [Fig. 4(a)], pushing the limits of both modulator FSR and thermal efficiency. As detailed in [65], this design leverages the phase mismatch and reduced overlap integral afforded by half-etching the external doped-silicon heater, permitting close placement of the heater around the resonator with a 200 nm gap without disturbing the whispering gallery mode. The fabricated disk modulator exhibits a wide measured FSR of 58.6 nm [Fig. 4(b)] and a thermal tuning efficiency comparable to that of disk modulators with internal heaters [Fig. 4(c)]. Modulation was demonstrated with an open eye diagram at 16 Gbps at 1.3 V Vpp [Fig. 4(d)]. These initial results offer valuable insights for future externally heated resonant modulator designs. Such designs could rival internally heated modulators in performance while achieving substantially larger FSRs to meet the growing demands of DWDM scaling.

**3) Microdisk Add-Drop Filters:** At the receiver side, we propose using microdisk filters, despite the prevalence of single-mode ring filters in current integrated SiPh DWDM architectures. One significant advantage of employing disk filters is that the dispersion characteristics of the modulators and filters are matched, simplifying the overall design through symmetry. To maintain single-mode operation in these multi-mode disk filters, we strategically place the integrated heaters to introduce loss for higher order modes, effectively suppressing their resonances to ensure optimal filter performance, as reported in [20].

Fig. 4. Microdisk modulator with external half-etched heater [65]. (a) Illustration of the modulator design. (b) Measured transmission spectrum showing 58.6 nm FSR. (c) Measured thermal tuning efficiency of 0.6 nm/mW. (d) Eye diagram at 16 Gbps driven with 1.3 V Vpp.

### C. Wafer-Scale Substrate Undercut

Both the (de-)interleavers and the microdisk modulators/filters employ thermal tuning to compensate for process variations and achieve precise alignment with the DWDM channels. To improve the energy efficiency of the transceiver, we have co-developed a wafer-scale substrate undercut (UC) process with our foundry partner, AIM Photonics. As shown in Fig. 5(a)–(c), this isotropic etching process removes the oxide and part of the substrate surrounding and underneath the device of interest, thereby enabling highly efficient thermal tuning and reduced thermal crosstalk. With fully released devices, i.e., devices surrounded by air and mechanically supported by oxide bridges, our wafer-scale measurements of UC microdisk modulators demonstrate a consistent $5\times$ improvement in thermal tuning efficiency across 128 reticles on two wafers, compared to non-UC devices, as shown in Fig. 5(d) and detailed in [66]. This improvement, along with efficient modulation from the optimized modulator design, provides a viable pathway to achieve sub-pJ/b energy consumption in the proposed DWDM link architecture, as discussed in Section VI.

Fig. 5. Wafer-scale substrate undercut for improved thermal tuning efficiency. (a) Microscope image of an undercut microdisk modulator. (b) Cross section of the undercut microdisk modulator. (c) Rendered view from inside the undercut looking up at the modulator. (d) Measured thermal tuning efficiency improvement of undercut microdisk modulators across two wafers [66].

## IV. LINK VALIDATION

This section presents an end-to-end comb-driven data transmission experiment to validate the proposed link design. Fig. 6(a) is a microscope image of a fabricated test chip for the proposed link architecture. It consists of the $4 \times 17$ modulator/filter arrays as described by Fig. 1(d) and (f) of Section II, accessible through edge couplers from both ends. Fig. 6(b) shows the spectrum of the comb source driving the test link. The schematic of the experimental setup is illustrated in Fig. 6(c), in which two test chips are connected by a 20-m-long fiber, with one serving as the Tx and the other as the Rx. Due to constraints associated with the channel count of the electrical probe, multiple comb lines are modulated sequentially using a commercial bandpass filter after the comb output. The filtered carrier wavelength is then amplified through a thulium-doped fiber amplifier (TDFA) and guided through a polarization controller (PC) to optimize the insertion loss into the Tx chip. A direct current (dc) power supply is connected to the disk modulator phase shifter to align the modulator resonance with the target comb line. Modulation is performed by an Anritsu MP1900A pulse pattern generator generating a 16 Gbps PRBS15 signal with a peak-to-peak voltage of 1 V. The modulated signals are first measured by a Keysight N1092C sampling oscilloscope, and the resulting open *optical* eyes are presented in Fig. 6(d) (yellow). Then, the output from the Tx chip is amplified by a second TDFA, traverses the 20-m-long fiber, and passes through a PC before entering the Rx chip. A bias tee facilitates the connection between the on-chip photodetector and the sampling oscilloscope, while a dc power supply applies a reverse bias of $-1$ V to the photodetector. A second dc power supply interfaces with the phase shifter of the microdisk filter, analogous to the Tx setup. The *electrical* eye diagrams from the Rx photodetectors showcasing the received signals are also presented in Fig. 6(d) (green). A total of ten wavelength channels (20 eye diagrams) are collected, demonstrating the efficacy of the proposed link architecture.

## V. PACKAGING FOR HIGH BANDWIDTH DENSITY

In addition to architectural innovations, the packaging of the photonic integrated circuit (PIC) alongside its driver electronic integrated circuit (EIC) and compute chips is essential for achieving high escape bandwidth and bandwidth density from the optical I/O. Current co-packaged optics implementations, notably based on monolithic integration of the EIC and the PIC [35], face density challenges due to the need for the electronic drivers to be placed in the same plane as the photonic components. This approach also limits the use of advanced CMOS nodes for the EIC, leading to performance and energy concerns. To this end, 3-D integration of the EIC and the PIC at a dense $\mu$bump pitch becomes essential to push the areal bandwidth density beyond the $\text{Tbps}/\text{mm}^2$ regime. As shown in Fig. 7 and detailed in [21], we have demonstrated a record-high $5.3\ \text{Tbps}/\text{mm}^2$ multichip module (MCM) prototype featuring a 3-D flip-chip bonded EIC over the PIC. We push the limits of this bonding technology by using a $15\ \mu\text{m}$ spacing and $10\ \mu\text{m}$ bump diameters ($25\ \mu\text{m}$ pitch) in an array of 2,304 bonds. The 3-D integrated module contains 80 Tx cells and 80 Rx cells; these cells are organized into 20 waveguide buses, each with four wavelength channels. Eye diagrams for all 80 modulators on the PIC at 10 Gbps/channel are successfully measured with all EIC channels driving at 1 V Vpp simultaneously. The receivers are characterized with a $-24.85\ \text{dBm}$ sensitivity for a bit error rate (BER) of $4 \times 10^{-10}$ at 10 Gbps [21], [67]. Incorporating the validated interleaved-contact microdisk modulator design featured in Section III-B1, we have designed and fabricated a next-generation MCM with 64 channels each targeting 32 Gbps under the same $\mu$bump pitch. This updated design thus promises to deliver an areal bandwidth density of $17\ \text{Tbps}/\text{mm}^2$.

Besides areal bandwidth density, the shoreline bandwidth density is also a critical metric for optical I/Os co-packaged with computing units in a multichip package [Fig. 8(a)]. For a given pitch of the fiber array unit, e.g., $127\ \mu\text{m}$, the shoreline bandwidth density can be maximized if all components contributing to the bandwidth capacity per fiber are placed within a narrow strip matching the fiber pitch. This further highlights the necessity of 3-D integration to avoid placing the EIC and PIC in the same plane. In [20], we further pushed the limits of package shoreline bandwidth density by designing a photonic I/O chiplet consisting of 16 transceivers with the architecture of Fig. 1(d) and (f). As shown in Fig. 8(b), each chiplet is integrated with 1,024 microdisk modulators and 1,024 microdisk filters. The modulators feature a four-contact interleaved design as seen in Fig. 3(c), supporting up to 16 Gbps/channel. The I/O chiplet thus packs an aggregate bandwidth of 16 Tbps for both Tx and Rx along a single 8.1 mm shoreline, equivalent to over $2\ \text{Tbps}/\text{mm}$ shoreline bandwidth density. We have also incorporated the updated microdisk modulator design into our latest version of the PIC design, aiming to double the shoreline bandwidth density to over $4\ \text{Tbps}/\text{mm}$ with 32 Gbps/channel achievable. Characterization and demonstration of these high-bandwidth-density package prototypes in future work will be a significant step toward realizing the envisioned petascale connectivity.
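The shoreline figures quoted above follow from simple arithmetic, reproduced in the short Python sketch below. It uses only numbers stated in the text (16 transceivers, 1,024 modulators per chiplet, an 8.1 mm shoreline) and counts one direction (Tx or Rx) at a time; the function and constant names are illustrative.

```python
def shoreline_density_tbps_per_mm(num_channels, gbps_per_channel, shoreline_mm):
    """Aggregate one-direction bandwidth divided by the chiplet edge length."""
    aggregate_tbps = num_channels * gbps_per_channel / 1000
    return aggregate_tbps / shoreline_mm


TRANSCEIVERS = 16
CHANNELS_PER_TRANSCEIVER = 64      # 64-channel DWDM per transceiver (Section II)
SHORELINE_MM = 8.1

channels = TRANSCEIVERS * CHANNELS_PER_TRANSCEIVER   # 1,024 modulators (and filters)
for rate in (16, 32):
    density = shoreline_density_tbps_per_mm(channels, rate, SHORELINE_MM)
    print(f"{rate} Gbps/channel: {channels * rate / 1000:.3f} Tbps, {density:.2f} Tbps/mm")
```

At 16 Gbps/channel this gives about 2.02 Tbps/mm, and at 32 Gbps/channel about 4.05 Tbps/mm, consistent with the "over 2" and "over 4" Tbps/mm values in the text.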

Fig. 6. End-to-end comb-driven data transmission experiment. (a) Microscope image of the photonic integrated circuit (PIC) test chip consisting of the $4 \times 17$ modulator/filter arrays following the architectural design described in Fig. 1(d) and (f). (b) Spectrum of the comb source driving the test link. (c) Schematic of the experimental setup. (d) Tx optical and Rx electrical eye diagrams measured from ten wavelength channels.

Fig. 7. High-density multichip module (MCM) with 3-D-integrated EIC and photonic integrated circuit (PIC) [21], [67]. (a) Photo and (b) microscope image of the multichip module (MCM) prototypes. (c) Microbumped photonic chip after fabrication. (d) $\mu$bumps and (e) metal pads on photonic integrated circuit (PIC) at 25 $\mu\text{m}$ pitch. (f) Cross-sectional stack diagram of the EIC and the PIC.

## VI. LINK BUDGET AND ENERGY EFFICIENCY

We perform a link budget analysis to assess the energy efficiency of the proposed link architecture. As shown in Fig. 9, the link budget starts from the optical power provided by the comb source. After accounting for various loss terms and power penalties, the optical power at the Rx end must meet the sensitivity requirement for a specified BER. The static loss terms are based on measurement results from fabricated devices and test structures, including comb-to-fiber coupling loss, PIC coupling loss, (de-)interleaver insertion loss (IL), modulator off-resonance loss, waveguide propagation loss, filter off-resonance loss, and filter on-resonance loss (drop-port loss). The modulators also introduce power penalties to the link budget based on the driving V<sub>pp</sub>, due to the finite extinction ratios and insertion losses. As described in [55], the total power penalty from modulation is the sum of the insertion loss (IL<sub>mod</sub>), the extinction ratio (ER) penalty (PP<sub>ER</sub>), and the on-off keying (OOK) penalty (PP<sub>OOK</sub>):

$$\text{PP}_{\text{total}} = \text{IL}_{\text{mod}} + \text{PP}_{\text{ER}} + \text{PP}_{\text{OOK}} \quad (2)$$

where

$$\text{PP}_{\text{ER}} = 10 \log_{10} \left( \frac{r+1}{r-1} \right) \quad (3a)$$

$$\text{PP}_{\text{OOK}} = 10 \log_{10} \left( \frac{2r}{r+1} \right). \quad (3b)$$

TABLE I  
LINK BUDGET CALCULATION FOR 32 Gbps/CHANNEL AT 0.8 V Vpp

| Loss Component                | Value        | Unit | Multiplier | Source                      |
|-------------------------------|--------------|------|------------|-----------------------------|
| <b>Transmitter Loss Total</b> | <b>14.56</b> | dB   |            |                             |
| Comb to Fiber                 | 2.20         | dB   |            | Measured                    |
| Edge Coupler In               | 0.92         | dB   |            | Measured                    |
| De-Interleaver Stages         | 0.35         | dB   | × 2        | Measured                    |
| Modulator Off-Resonance       | 0.10         | dB   | × 15       | Measured                    |
| Modulator Power Penalties     | 7.29         | dB   |            | Measured Depletion Response |
| Crosstalk Penalty             | –            | dB   |            | > 100 GHz Channel Spacing   |
| Interleaver Stages            | 0.35         | dB   | × 2        | Measured                    |
| Propagation Loss              | 0.33         | dB   |            | PIC Layout                  |
| Edge Coupler Out              | 0.92         | dB   |            | Measured                    |
| <b>Receiver Loss Total</b>    | <b>3.32</b>  | dB   |            |                             |
| Edge Coupler In               | 0.92         | dB   |            | Measured                    |
| De-Interleaver Stages         | 0.35         | dB   | × 2        | Measured                    |
| Filter Off-Resonance          | 0.10         | dB   | × 15       | Measured                    |
| Filter On-Resonance           | 0.20         | dB   |            | Measured                    |
| Crosstalk Penalty             | –            | dB   |            | > 100 GHz Channel Spacing   |
| <b>Link Loss Total</b>        | <b>17.88</b> | dB   |            |                             |
| + Receiver Sensitivity        | -19.36       | dBm  |            | For 32 Gbps/Channel         |
| = Min. Power per Line         | <b>-1.49</b> | dBm  |            |                             |



Fig. 8. High-bandwidth density photonic I/O chiplet for co-packaging with computing units [20]. (a) Conceptual multichip package hosting multiple 3-D-integrated EIC/PIC pairs next to a compute chip, providing high-density optical data I/O. (b) Detailed PIC chiplet floor plan demonstrating feasibility toward 4 Tbps/mm shoreline bandwidth density.

TABLE II  
ENERGY-PER-BIT BREAKDOWN FOR 32 Gbps/CHANNEL AT 0.8 V Vpp

| Group        | Component           | Energy [fJ/b]<br>w/o undercut | Energy [fJ/b]<br>w/ undercut |
|--------------|---------------------|-------------------------------|------------------------------|
| <b>Comb*</b> | Comb Generation     | 148.0                         |                              |
|              | Comb Thermal        | 24.9                          |                              |
| <b>EIC</b>   | Tx Driver           | 40.0                          |                              |
|              | Rx TIA              | 100.4                         |                              |
| <b>PIC</b>   | Interleaver Thermal | 35.2                          |                              |
|              | Modulator Thermal   | 58.0                          |                              |
|              | Filter Thermal      | 26.0                          |                              |
| <b>Total</b> |                     | 432.4                         | 337.1                        |

\* Assuming 15 % overall comb WPE.

Here, $r$ represents the modulation ER in linear scale, defined as $r = 10^{\text{ER}/10}$. $\text{IL}_{\text{mod}}$ and ER can be calculated from the measured disk depletion response [Fig. 10(a)] for a given voltage swing. The total power penalty for a target modulation Vpp is then calculated based on the detuning distance that achieves the maximum optical modulation amplitude (OMA). Fig. 10(b) shows the case for a 0.8 V Vpp, where the total power penalty from modulation is found as 7.29 dB. A larger Vpp generally results in a lower power penalty, at the cost of higher energy consumption from the Tx driver circuitry.
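Equations (2), (3a), and (3b) can be evaluated directly once IL<sub>mod</sub> and ER are known. The sketch below is illustrative only: the function names are our own, and the example operating point (3.0 dB insertion loss, 4.0 dB extinction ratio) is a hypothetical placeholder rather than a measured value, so it is not expected to reproduce the 7.29 dB figure.

```python
import math


def _linear_er(er_db: float) -> float:
    """Convert an extinction ratio in dB to linear scale, r = 10^(ER/10)."""
    if er_db <= 0:
        raise ValueError("ER must be positive in dB so that r > 1")
    return 10 ** (er_db / 10)


def er_penalty_db(er_db: float) -> float:
    """Eq. (3a): PP_ER = 10*log10((r + 1) / (r - 1))."""
    r = _linear_er(er_db)
    return 10 * math.log10((r + 1) / (r - 1))


def ook_penalty_db(er_db: float) -> float:
    """Eq. (3b): PP_OOK = 10*log10(2r / (r + 1))."""
    r = _linear_er(er_db)
    return 10 * math.log10(2 * r / (r + 1))


def total_modulation_penalty_db(il_mod_db: float, er_db: float) -> float:
    """Eq. (2): PP_total = IL_mod + PP_ER + PP_OOK (all in dB)."""
    return il_mod_db + er_penalty_db(er_db) + ook_penalty_db(er_db)


# Hypothetical operating point, for illustration only (not measured data)
example_il_db = 3.0
example_er_db = 4.0
example_total = total_modulation_penalty_db(example_il_db, example_er_db)
print(f"PP_total = {example_total:.2f} dB")
```

As the extinction ratio grows, PP<sub>ER</sub> tends to 0 dB and PP<sub>OOK</sub> tends to $10 \log_{10} 2 \approx 3.01$ dB, so the total penalty approaches IL<sub>mod</sub> plus about 3 dB.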

To evaluate the energy efficiency of the link, we back-calculate the required optical power per line at the comb output from the Rx sensitivity requirement for the specified data rate:

$$P_{\text{sens}} = \frac{Q \cdot i_n^{\text{rms}}}{R} \quad (4)$$

where $R = 1.1 \text{ A/W}$ is the measured responsivity of the PD, $Q \approx 7.035$ for a BER of $10^{-12}$, and $i_n^{\text{rms}}$ is the input-referred root mean square (rms) noise current. The sensitivity model is calibrated against the measured results reported in [21] and [67] and the simulation result of the EIC design used in [20], as shown in Fig. 10(c). Finally, the minimum power per line needed at the comb output can be calculated as a function of both the modulation Vpp and the target data rate, as shown in Fig. 10(d). In Table I, we summarize the detailed link budget calculation for a 0.8 V Vpp and 32 Gbps/channel nominal design.
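The back-calculation summarized in Table I amounts to summing the loss terms in dB (each weighted by its multiplier) and adding the result to the Rx sensitivity in dBm. A minimal sketch using the Table I entries is given below; the data layout and names are our own choice for illustration.

```python
# (component, loss in dB, multiplier), copied from Table I
TX_LOSSES = [
    ("Comb to fiber", 2.20, 1),
    ("Edge coupler in", 0.92, 1),
    ("De-interleaver stages", 0.35, 2),
    ("Modulator off-resonance", 0.10, 15),
    ("Modulator power penalties", 7.29, 1),
    ("Interleaver stages", 0.35, 2),
    ("Propagation loss", 0.33, 1),
    ("Edge coupler out", 0.92, 1),
]
RX_LOSSES = [
    ("Edge coupler in", 0.92, 1),
    ("De-interleaver stages", 0.35, 2),
    ("Filter off-resonance", 0.10, 15),
    ("Filter on-resonance", 0.20, 1),
]
RX_SENSITIVITY_DBM = -19.36  # for 32 Gbps/channel


def total_db(terms):
    """Sum the loss terms in dB, applying each multiplier."""
    return sum(loss * mult for _, loss, mult in terms)


tx_db = total_db(TX_LOSSES)
rx_db = total_db(RX_LOSSES)
link_db = tx_db + rx_db

# Required optical power per comb line at the comb output
min_power_dbm = RX_SENSITIVITY_DBM + link_db
min_power_mw = 10 ** (min_power_dbm / 10)

print(f"Tx loss: {tx_db:.2f} dB, Rx loss: {rx_db:.2f} dB, link: {link_db:.2f} dB")
print(f"Min. power per line: {min_power_dbm:.2f} dBm ({min_power_mw:.3f} mW)")
```

Summing the rounded table entries gives a minimum power per line within 0.01 dB of the tabulated value; the small difference is consistent with rounding of the individual entries.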



Fig. 9. Illustration of the link budget for the proposed link architecture. The minimum optical power per line at the comb output is back-calculated from the Rx sensitivity requirement for a given data rate, accounting for several measured optical losses in the link and V<sub>pp</sub>-dependent power penalties from the modulators.



Fig. 10. (a) Measured depletion response of the microdisk modulators is used to calculate (b) modulation power penalties for a given V<sub>pp</sub>. (c) Rx sensitivity modeled as a function of data rate. (d) Back-calculated minimum optical power per line for various modulation V<sub>pp</sub> and data rates. The stars denote the nominal design point for 0.8 V V<sub>pp</sub> and 32 Gbps/channel.

We then derive an energy-per-bit breakdown for the link architecture, showing the energy contributions from each constituent component. For the comb source, we assume an overall wall-plug efficiency (WPE) of 15%, corresponding to a 35% pump WPE [68] multiplied by a 43% measured pump-to-comb conversion efficiency. The energy consumption of the Tx driver equals  $(1/4)CV^2$  [69], where  $C$  is the capacitance being charged or discharged during a bit transition.



Fig. 11. Energy-per-bit percentage of each link component for 32 Gbps/channel at 0.8 V Vpp, assuming thermal undercut.

Capacitance sources include the microdisk p-n junction, bond pads, and the capacitances within the driver circuitry, totaling $\sim 200$ fF [21]. The Rx transimpedance amplifier (TIA) energy as a function of the data rate is modeled as a polynomial fit over data points reported in the literature for similar designs [21], [70], [71], [72]. The PIC thermal tuning energy is calculated from measured fabrication variations, which determine the average tuning distance required for each resonant device. Finally, the energy-per-bit breakdown for the 0.8 V V<sub>pp</sub> and 32 Gbps/channel nominal design is summarized in Table II, both with and without thermal undercut. Assuming a 5× improvement in the thermal tuning efficiency achievable by thermal undercut, the link can achieve an energy efficiency of 0.34 pJ/b from the comb source, the EIC drivers, and the PIC thermal tuning altogether. An energy-per-bit percentage breakdown is further shown in Fig. 11 for the case with thermal undercut. Note that the energy breakdown does not include EIC components such as the clocking circuitry, which are less dependent on specific link architectures and thus not explored in this work. However, because they could add 0.3–0.6 pJ/b to the total energy consumption [20], [71], optimizations in the photonics design, as conducted in this work, become even more critical for achieving the envisioned sub-pJ/b energy efficiency. In future work, we will continue to optimize the link budget, such as the comb-to-fiber coupling loss and the modulator power penalties, through device-architecture-packaging co-design and co-optimization.
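The energy-per-bit totals can be cross-checked with a short calculation. The sketch below rests on two assumptions that are our reading rather than statements from the source: (i) the comb generation energy is the minimum optical power per line from Table I divided by the overall comb WPE and by the per-channel data rate, and (ii) the 5× undercut improvement applies only to the three PIC thermal terms, with the comb thermal term unchanged. All names are illustrative.

```python
# Table II entries without thermal undercut, in fJ/b
ENERGY_FJ_PER_BIT = {
    "Comb Generation": 148.0,
    "Comb Thermal": 24.9,
    "Tx Driver": 40.0,
    "Rx TIA": 100.4,
    "Interleaver Thermal": 35.2,
    "Modulator Thermal": 58.0,
    "Filter Thermal": 26.0,
}

# Assumption: only these terms benefit from thermal undercut
PIC_THERMAL_TERMS = ("Interleaver Thermal", "Modulator Thermal", "Filter Thermal")
UNDERCUT_GAIN = 5.0  # assumed 5x improvement in thermal tuning efficiency


def comb_generation_fj_per_bit(power_per_line_dbm: float,
                               wpe: float,
                               data_rate_gbps: float) -> float:
    """Electrical energy per bit needed to supply one comb line (assumed model)."""
    optical_w = 10 ** (power_per_line_dbm / 10) * 1e-3  # dBm -> W
    electrical_w = optical_w / wpe
    return electrical_w / (data_rate_gbps * 1e9) * 1e15  # J/b -> fJ/b


def total_fj_per_bit(energies: dict, undercut: bool = False) -> float:
    """Sum all terms, scaling the PIC thermal terms when undercut is applied."""
    total = 0.0
    for name, value in energies.items():
        if undercut and name in PIC_THERMAL_TERMS:
            value = value / UNDERCUT_GAIN
        total += value
    return total


overall_wpe = 0.35 * 0.43  # pump WPE x pump-to-comb conversion efficiency
comb_fj = comb_generation_fj_per_bit(-1.49, 0.15, 32.0)
total_without = total_fj_per_bit(ENERGY_FJ_PER_BIT)
total_with = total_fj_per_bit(ENERGY_FJ_PER_BIT, undercut=True)

print(f"Overall comb WPE: {overall_wpe:.3f}")
print(f"Comb generation: {comb_fj:.1f} fJ/b")
print(f"Total w/o undercut: {total_without:.1f} fJ/b")
print(f"Total w/ undercut: {total_with:.1f} fJ/b")
```

Under these assumptions, the sketch yields roughly 148 fJ/b for comb generation and totals that agree with the Table II values to within rounding, which supports the 0.34 pJ/b figure quoted for the undercut case.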

## VII. CONCLUSION

In this work, we presented a scalable co-designed silicon photonics chip I/O to address the urgent need for ultrahigh-bandwidth and energy-efficient data links in hyperscale computing systems. By leveraging massive DWDM and advanced packaging techniques, we demonstrated pathways toward achieving shoreline and areal bandwidth densities beyond 4 Tbps/mm and 17 Tbps/mm<sup>2</sup> while maintaining sub-pJ/b energy consumption. Hardware validations at device, link, and package levels underscore the potential and viability of the proposed architecture for realizing petascale system-wide connectivity. Future work will focus on further optimizations of device design and link budget through a cross-level co-design approach, paving the way for broader adoption of silicon photonics optical I/Os in next-generation computing systems.

## REFERENCES

- [1] S. Rumley, K. Bergman, M. A. Seyed, and M. Fiorentino, “Evolving requirements and trends of HPC,” in *Proc. Springer Handbook Opt. Netw.*, B. Mukherjee, I. Tomkos, M. Tornatore, P. Winzer, and Y. Zhao, Eds., Cham, Switzerland: Springer, 2020, pp. 725–755.
- [2] D. Narayanan et al., “Efficient large-scale language model training on GPU clusters using megatron-LM,” in *Proc. Int. Conf. High Perform. Comput., Netw., Storage Anal.*, New York, NY, USA, Nov. 2021, pp. 1–14.
- [3] Z. Ma et al., “BaGuaLu: Targeting brain scale pretrained models with over 37 million cores,” in *Proc. 27th ACM SIGPLAN Symp. Princ. Pract. Parallel Program.*, Seoul, South Korea, Apr. 2022, pp. 192–204.
- [4] W. J. Dally, S. W. Keckler, and D. B. Kirk, “Evolution of the graphics processing unit (GPU),” *IEEE Micro*, vol. 41, no. 6, pp. 42–51, Nov. 2021.
- [5] N. Jouppi et al., “TPU v4: An optically reconfigurable supercomputer for machine learning with hardware support for embeddings,” in *Proc. 50th Annu. Int. Symp. Comput. Archit.*, New York, NY, USA, Jun. 2023, pp. 1–14.
- [6] M. O’Connor et al., “Fine-grained DRAM: Energy-efficient DRAM for extreme bandwidth systems,” in *Proc. IEEE/ACM Int. Symp. Microarchit.*, Oct. 2017, pp. 41–54. [Online]. Available: <https://ieeexplore.ieee.org/document/8686544>
- [7] NVIDIA DGX B200 Datasheet. Accessed: Nov. 17, 2024. [Online]. Available: <https://resources.nvidia.com/en-us-dgx-systems/dgx-b200-datasheet>
- [8] A. Smith et al., “11.1 AMD Instinct™ MI300 series modular chiplet package—HPC and AI accelerator for exa-class systems,” in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, San Francisco, CA, USA, Feb. 2024, pp. 490–492.
- [9] M. Zhu, Y. Zhuo, C. Wang, W. Chen, and Y. Xie, “Performance evaluation and optimization of HBM-enabled GPU for data-intensive applications,” *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 26, no. 5, pp. 831–840, May 2018.
- [10] S. Pati, S. Aga, M. Islam, N. Jayasena, and M. D. Sinclair, “Tale of two Cs: Computation vs. communication scaling for future transformers on future hardware,” in *Proc. IEEE Int. Symp. Workload Characterization (IISWC)*, Ghent, Belgium, Oct. 2023, pp. 140–153.
- [11] A. Gholami, Z. Yao, S. Kim, C. Hooper, M. W. Mahoney, and K. Keutzer, “AI and memory wall,” 2024, *arXiv:2403.14123*.
- [12] Z. Wu et al., “Peta-scale embedded photonics architecture for distributed deep learning applications,” *J. Lightw. Technol.*, vol. 41, no. 12, pp. 3737–3749, Jun. 15, 2023.
- [13] R. Lucas et al., “DOE advanced scientific computing advisory subcommittee (ASCAC) report: Top ten exascale research challenges,” U.S. Dept. Energy, Office Scientific Tech. Inf., Washington, DC, USA, Tech. 1222713, Feb. 2014.
- [14] D. A. B. Miller, “Rationale and challenges for optical interconnects to electronic chips,” *Proc. IEEE*, vol. 88, no. 6, pp. 728–749, Jun. 2000.
- [15] *InfiniBand Roadmap—Advancing InfiniBand*. Accessed: Nov. 17, 2024. [Online]. Available: <https://www.infinibandta.org/infiniband-roadmap/>
- [16] B. G. Lee, N. Nedovic, T. H. Greer, and C. T. Gray, “Beyond CPO: A motivation and approach for bringing optics onto the silicon interposer,” *J. Lightw. Technol.*, vol. 41, no. 4, pp. 1152–1162, Feb. 15, 2023.
- [17] N. Margalit, C. Xiang, S. M. Bowers, A. Bjorlin, R. Blum, and J. E. Bowers, “Perspective on the future of silicon photonics and electronics,” *Appl. Phys. Lett.*, vol. 118, no. 22, May 2021, Art. no. 220501.
- [18] R. Mahajan et al., “Co-packaged photonics for high performance computing: Status, challenges and opportunities,” *J. Lightw. Technol.*, vol. 40, no. 2, pp. 379–392, Aug. 13, 2022.
- [19] A. Rizzo et al., “Petabit-scale silicon photonic interconnects with integrated Kerr frequency combs,” *IEEE J. Sel. Topics Quantum Electron.*, vol. 29, no. 1, pp. 1–20, Jan. 2023.
- [20] Y. Wang et al., “Silicon photonics chip I/O for ultra high-bandwidth and energy-efficient die-to-die connectivity,” in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, Denver, CO, USA, Apr. 2024, pp. 1–8.
- [21] S. Daudlin et al., “3D photonics for ultra-low energy, high bandwidth-density chip data links,” 2023, *arXiv:2310.01615*.
- [22] (Mar. 2024). *Intel® Shows OCI Optical I/O Chiplet Co-Packaged With CPU at OFC2024, Enabling Explosive AI Scaling*. [Online]. Available: <https://community.intel.com/t5/Blogs/Tech-Innovation-Artificial-Intelligence-AI/Intel-Shows-OCI-Optical-I-O-Chiplet-Co-packaged-with-CPU-at/post/1582541>
- [23] Ayar Labs at OFC 2024: Leading Optical I/O Innovation. Accessed: Nov. 17, 2024. [Online]. Available: <https://ayarlabs.com/news/ayar-labs-to-showcase-optical-interconnect-solutions-to-redefine-ai-at-ofc-2024/>
- [24] (Mar. 2024). *RANOVUS Delivers Industry’s First 6.4Tbps Co-Packaged Optics With Integrated Laser for AI/ML Application at OFC 2024*. [Online]. Available: <https://ranovus.com/ranovus-delivers-industrys-first-6-4tbps-co-packed-optics-with-integrated-laser-for-ai-ml-application-at-ofc-2024/>
- [25] *1.6T-QSFP224-AI and Data Center Networking*. Accessed: Nov. 17, 2024. [Online]. Available: <https://www.innolight.com/en/goods/info.html?cid=8>
- [26] *800G-QSFP112-DD-AI and Data Center Networking*. Accessed: Nov. 17, 2024. [Online]. Available: <https://www.innolight.com/en/goods/info.html?cid=5>
- [27] *800G-QSFP224-AI and Data Center Networking*. Accessed: Nov. 17, 2024. [Online]. Available: <https://www.innolight.com/en/goods/info.html?cid=7>
- [28] Y. Wang et al., “Scalable architecture for sub-pJ/b multi-Tbps comb-driven DWDM silicon photonic transceiver,” *Proc. SPIE*, vol. 12429, pp. 271–288, Mar. 2023.
- [29] A. H. Atabaki et al., “Integrating photonics with silicon nanoelectronics for the next generation of systems on a chip,” *Nature*, vol. 556, no. 7701, pp. 349–354, Apr. 2018.
- [30] C. Sun et al., “Single-chip microprocessor that communicates directly using light,” *Nature*, vol. 528, no. 7583, pp. 534–538, Dec. 2015.
- [31] R. Soref, “The past, present, and future of silicon photonics,” *IEEE J. Sel. Topics Quantum Electron.*, vol. 12, no. 6, pp. 1678–1687, Nov. 2006.
- [32] E. Timurdogan, C. M. Sorace-Agaskar, J. Sun, E. S. Hosseini, A. Biberman, and M. R. Watts, “An ultralow power athermal silicon modulator,” *Nature Commun.*, vol. 5, no. 1, p. 4008, Jun. 2014.
- [33] W. Bogaerts et al., “Silicon microring resonators,” *Laser Photon. Rev.*, vol. 6, no. 1, pp. 47–73, Jan. 2012.
- [34] Q. Xu, B. Schmidt, S. Pradhan, and M. Lipson, “Micrometre-scale silicon electro-optic modulator,” *Nature*, vol. 435, no. 7040, pp. 325–327, May 2005.
- [35] M. Wade et al., “TeraPHY: A chiplet technology for low-power, high-bandwidth in-package optical I/O,” *IEEE Micro*, vol. 40, no. 2, pp. 63–71, Mar. 2020.

- [36] Q. Xu, B. Schmidt, J. Shakya, and M. Lipson, "Cascaded silicon micro-ring modulators for WDM optical interconnection," *Opt. Exp.*, vol. 14, no. 20, p. 9431, 2006.
- [37] D. Tonietto, "Connecting switch to fiber: The energy efficiency challenge," in *Proc. Opt. Fiber Commun. Conf. (OFC)*, Mar. 2024, pp. 1–3.
- [38] W. J. Turner et al., "Leveraging micro-bump pitch scaling to accelerate interposer link bandwidths for future high-performance compute applications," in *Proc. IEEE Custom Integr. Circuits Conf. (CICC)*, Apr. 2024, pp. 1–7.
- [39] L. Chang, S. Liu, and J. E. Bowers, "Integrated optical frequency comb technologies," *Nature Photon.*, vol. 16, no. 2, pp. 95–108, Feb. 2022.
- [40] H. Hu and L. K. Oxenløwe, "Chip-based optical frequency combs for high-capacity optical communications," *Nanophotonics*, vol. 10, no. 5, pp. 1367–1385, Mar. 2021.
- [41] A. L. Gaeta, M. Lipson, and T. J. Kippenberg, "Photonic-chip-based frequency combs," *Nature Photon.*, vol. 13, no. 3, pp. 158–169, Mar. 2019.
- [42] T. Fortier and E. Baumann, "20 years of developments in optical frequency comb technology and applications," *Commun. Phys.*, vol. 2, no. 1, pp. 1–16, Dec. 2019.
- [43] A. Rizzo et al., "Massively scalable Kerr comb-driven silicon photonic link," *Nature Photon.*, vol. 17, no. 9, pp. 781–790, Jun. 2023.
- [44] Y. Okawachi, B. Y. Kim, M. Lipson, and A. L. Gaeta, "Chip-scale frequency combs for data communications in computing systems," *Optica*, vol. 10, no. 8, pp. 977–995, Aug. 2023.
- [45] H. Shu et al., "Microcomb-driven silicon photonic systems," *Nature*, vol. 605, no. 7910, pp. 457–463, May 2022.
- [46] D. Kong et al., "Intra-datacenter interconnects with a serialized silicon optical frequency comb modulator," *J. Lightw. Technol.*, vol. 38, no. 17, pp. 4677–4682, Sep. 1, 2020.
- [47] C.-H. Chen et al., "A comb laser-driven DWDM silicon photonic transmitter based on microring modulators," *Opt. Exp.*, vol. 23, no. 16, p. 21541, Aug. 2015.
- [48] J. S. Levy, A. Gondarenko, M. A. Foster, A. C. Turner-Foster, A. L. Gaeta, and M. Lipson, "CMOS-compatible multiple-wavelength oscillator for on-chip optical interconnects," *Nature Photon.*, vol. 4, no. 1, pp. 37–40, Jan. 2010.
- [49] T. J. Kippenberg, A. L. Gaeta, M. Lipson, and M. L. Gorodetsky, "Dissipative Kerr solitons in optical microresonators," *Science*, vol. 361, no. 6402, Aug. 2018, Art. no. eaan8083.
- [50] B. Y. Kim et al., "Turn-key, high-efficiency Kerr comb source," *Opt. Lett.*, vol. 44, no. 18, p. 4475, Sep. 2019.
- [51] X. Xue, P. Wang, Y. Xuan, M. Qi, and A. M. Weiner, "Microresonator Kerr frequency combs with high conversion efficiency," *Laser Photon. Rev.*, vol. 11, no. 1, Jan. 2017, Art. no. 1600276.
- [52] X. Xue et al., "Mode-locked dark pulse Kerr combs in normal-dispersion microresonators," *Nature Photon.*, vol. 9, no. 9, pp. 594–600, Sep. 2015.
- [53] Y. Liu et al., "Investigation of mode coupling in normal-dispersion silicon nitride microresonators for Kerr frequency comb generation," *Optica*, vol. 1, no. 3, p. 137, Sep. 2014.
- [54] A. James et al., "Scaling comb-driven resonator-based DWDM silicon photonic links to multi-Tb/s in the multi-FSR regime," *Optica*, vol. 10, no. 7, pp. 832–840, Jul. 2023.
- [55] A. Novick et al., "High-bandwidth density silicon photonic resonators for energy-efficient optical interconnects," *Appl. Phys. Rev.*, vol. 10, no. 4, Dec. 2023, Art. no. 041306.
- [56] L.-W. Luo et al., "High bandwidth on-chip silicon photonic interleaver," *Opt. Exp.*, vol. 18, no. 22, p. 23079, Oct. 2010.
- [57] A. Rizzo, Q. Cheng, S. Daudlin, and K. Bergman, "Ultra-broadband interleaver for extreme wavelength scaling in silicon photonic links," *IEEE Photon. Technol. Lett.*, vol. 33, no. 1, pp. 55–58, Jan. 1, 2021.
- [58] F. Horst, W. M. J. Green, S. Assefa, S. M. Shank, Y. A. Vlasov, and B. J. Offrein, "Cascaded Mach-Zehnder wavelength filters in silicon photonics for low loss and flat pass-band WDM (de-)multiplexing," *Opt. Exp.*, vol. 21, no. 10, p. 11652, May 2013.
- [59] T. Akiyama et al., "Cascaded AMZ triplets: A class of demultiplexers having a monitor and control scheme enabling dense WDM on Si nanowaveguide PICs with ultralow crosstalk and high spectral efficiency," *Opt. Exp.*, vol. 29, no. 6, p. 7966, Mar. 2021.
- [60] S. Wang, Y. Wang, X. Meng, K. Hosseini, T. T. Hoang, and K. Bergman, "Automated tuning of ring-assisted MZI-based interleaver for DWDM systems," in *Proc. Opt. Fiber Commun. Conf. (OFC)*, San Diego, CA, USA, Mar. 2024, pp. 1–3.
- [61] A. Biberman, E. Timurdogan, W. A. Zortman, D. C. Trotter, and M. R. Watts, "Adiabatic microring modulators," *Opt. Exp.*, vol. 20, no. 28, pp. 29223–29236, Dec. 2012.
- [62] M. Gehl et al., "Operation of high-speed silicon photonic micro-disk modulators at cryogenic temperatures," *Optica*, vol. 4, no. 3, p. 374, Mar. 2017.
- [63] A. Novick, S. Wang, A. Rizzo, V. Gopal, and K. Bergman, "Ultra-efficient interleaved vertical-junction microdisk modulator with integrated heater," in *Proc. Opt. Fiber Commun. Conf. (OFC)*, Mar. 2024, pp. 1–3.
- [64] L. Alloatti, D. Cheian, and R. J. Ram, "High-speed modulator with interleaved junctions in zero-change CMOS photonics," *Appl. Phys. Lett.*, vol. 108, no. 13, Mar. 2016, Art. no. 131101.
- [65] M. Cullen et al., "Ultra-wide FSR vertical-junction microdisk modulator with efficient external heater," in *Proc. CLEO*, May 2024, pp. 1–2.
- [66] A. Rizzo et al., "Ultra-efficient foundry-fabricated resonant modulators with thermal undercut," in *Proc. CLEO*, May 2023, pp. 1–2.
- [67] S. Daudlin et al., "Ultra-dense 3D integrated 5.3 Tb/mm<sup>2</sup> 80 micro-disk modulator transmitter," in *Proc. Opt. Fiber Commun. Conf. Exhib. (OFC)*, Mar. 2023, pp. 1–3.
- [68] G. B. Morrison et al., "High power single mode photonic integration," in *Proc. IEEE High Power Diode Lasers Syst. Conf. (HPD)*, Oct. 2019, pp. 47–48.
- [69] D. A. B. Miller, "Energy consumption in optical modulators for interconnects," *Opt. Exp.*, vol. 20, no. S2, p. A293, Mar. 2012.
- [70] B. Razavi, "The design of a transimpedance amplifier [the analog mind]," *IEEE Solid State Circuits Mag.*, vol. 15, no. 1, pp. 7–11, Winter 2023.
- [71] C. S. Levy et al., "8-λ × 50 Gbps/λ heterogeneously integrated Si-Ph DWDM transmitter," *IEEE J. Solid-State Circuits*, vol. 59, no. 3, pp. 690–701, Mar. 2024.
- [72] P. Yan et al., "A 25-Gb/s 3-D direct bond silicon photonic receiver in 12-nm FinFET," *IEEE Solid-State Circuits Lett.*, vol. 7, pp. 34–37, 2024.