

# 3D photonics for ultra-low energy, high bandwidth-density chip data links

Stuart Daudlin<sup>1</sup>, Anthony Rizzo<sup>1,2</sup>, Sunwoo Lee<sup>3</sup>, Devesh Khilwani<sup>3</sup>, Christine Ou<sup>3</sup>, Songli Wang<sup>1</sup>, Asher Novick<sup>1</sup>, Vignesh Gopal<sup>1</sup>, Michael Cullen<sup>1</sup>, Robert Parsons<sup>1</sup>, Alyosha Molnar<sup>3</sup>, and Keren Bergman<sup>1,\*</sup>

<sup>1</sup>*Department of Electrical Engineering, Columbia University, New York, NY 10027*

<sup>2</sup>*Air Force Research Laboratory Information Directorate, Rome, NY 13441*

<sup>3</sup>*Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853*

Artificial intelligence (AI) hardware is positioned to unlock revolutionary computational abilities across diverse fields ranging from fundamental science [1] to medicine [2] and environmental science [3] by leveraging advanced semiconductor chips interconnected in vast distributed networks. However, AI chip development has far outpaced that of the networks that connect them, as chip computation speeds have accelerated a thousandfold faster than communication bandwidth over the last two decades [4, 5]. This gap is the largest barrier for scaling AI performance [6, 7] and results from the disproportionately high energy expended to transmit data [8], which is two orders of magnitude more intensive than computing [9]. Here, we show a leveling of this long-standing discrepancy and achieve the lowest energy optical data link to date through dense 3D integration of photonic and electronic chips. At 120 fJ of consumed energy per communicated bit and 5.3 Tb/s bandwidth per square millimeter of chip area, our platform simultaneously achieves a twofold improvement in both energy consumption and bandwidth density relative to prior demonstrations [10, 11]. These improvements are realized through employing massively parallel 80 channel microresonator-based transmitter and receiver arrays operating at 10 Gb/s per channel, occupying a combined chip footprint of only 0.32 mm<sup>2</sup>. Furthermore, commercial complementary metal-oxide-semiconductor (CMOS) foundries fabricate both the electronic and photonic chips on 300 mm wafers, providing a clear avenue to volume scaling. Through these demonstrated ultra-energy efficient, high bandwidth data communication links, this work eliminates the bandwidth bottleneck between spatially distanced compute nodes and will enable a fundamentally new scale of future AI computing hardware without constraints on data locality.

Light, as a medium for communication, has the unique ability to transmit large volumes of data with minimal energy loss. This capability not only sparked the revolution of internet-based communication over fibre optic networks, but also holds the potential to significantly expand computing power beyond current capabilities. Specifically, artificial intelligence (AI) is poised to dramatically transform the computational landscape if provided with more efficient data communication between nodes in computer networks [6, 12]. A critical bottleneck to the full implementation of light-based communication is the conversion of electrical data from inside a computer chip to optical data. At present, data is stored densely in the semiconductor chips of compute nodes, but is sent out of the chip through centimeter-long electrical wires before finally interfacing with optical transmitters in the form of pluggable optical transceivers. The design of these electrical channels results in slow data rates that cannot scale without substantial energy consumption [8]. To overcome this bottleneck, electrical channels must be condensed and converted into optical signals within a compact area.

Previously, intensive efforts have produced chip-scale transmitters and receivers (transceivers) towards this goal but have been marked by a lack of efficiency or scale. These works build on the field of integrated photonics, a technology that aggregates a multitude of optical components onto a single integrated chip. In particular, silicon is highly appealing as a material platform for integrated photonics since it can leverage the tremendous investment in the complementary metal-oxide-semiconductor (CMOS) infrastructure used to fabricate microelectronics chips [13]. The silicon photonics technology platform includes devices such as micro-resonator-based modulators [14, 15], filters, and germanium photodiodes [16] that are compact, efficient in their electrical-to-optical and optical-to-electrical conversions, and scalable to many wavelength channels [17, 18]. To date, the largest of these systems is comprised of 64 channels of photonics and electronics on a single chip and achieves 240 femtojoules per communicated bit (fJ/bit) by the transmitter [19–21]. However, this system has receiver energy consumption above 1000 fJ/bit and has a limited density from the lateral arrangement of photonics and electronics on the same two-dimensional chip. While this monolithic integration of CMOS transistors alongside photonic devices on the same chip may appear highly appealing [22, 23], this configuration ‘freezes’ transistors at a given node size and thus cannot benefit from the further energy efficiency, size, and speed gains of moving to more advanced CMOS nodes. Alternatively, three-dimensional (3D) integration combines a more efficient, leading edge CMOS node electronic chip and a separate photonic chip to improve on these limitations. 
Ongoing 3D efforts have demonstrated sub-200 fJ/bit powers from transmitters [10, 11, 24, 25] and receivers [10, 11, 26], but the chip-to-chip bond spacings are either significantly larger than the devices themselves [10, 24–26] or rely on emerging hybrid bonding technology [11]. Furthermore, 3D integrated transceivers have not scaled to more than eight channels [24] and have yet to achieve both transmitter and receiver powers below 100 fJ/bit.

**FIG. 1. 3D integrated photonic-electronic transceiver.** **a**, Illustration of the 3D integrated photonic-electronic system combining arrays of electronic transmitter and receiver cells with arrays of photonic devices. **b**, Cross-sectional diagram of the electronic and photonic chips and their associated material stacks. Both chips consist of a crystalline silicon (c-Si) substrate, doped silicon devices, and metal interconnection layers. **c**, A scanning electron microscope (SEM) image of the cross section of the flip-chip bonded electronic and photonic chips. **d**, Image of the wire-bonded transceiver die-bonded to a printed circuit board and optically coupled to a fibre array with a U.S. dime for scale. **e**, Microscope images of the standalone and 3D integrated chips with the larger photonic chip beneath the flipped electronic chip. The active photonic circuits occupy the area outlined in blue, while the rest of the photonic chip area is used to fan out the optical/electrical lanes for fibre coupling and wire bonding.

Here, we present the most energy-efficient conversion between electrical data and optical data and the highest density of data transmission from an integrated chip-scale system. Our novel approach to photonic transceivers simultaneously addresses energy efficiency and bandwidth density scaling for future computing systems. This transceiver demonstration is scaled up to 80 channels while having a scaled-down energy consumption through low-capacitance connections between low-capacitance photonics and co-designed CMOS electronic circuits. The data signaling rate is relatively low at 10 gigabits per second (Gb/s) per channel, which permits the receiver electronics to operate in an ideal regime for minimized energy consumption (Methods). The large array of channels compensates for the low per-channel data rates and delivers a high aggregate data rate of 800 Gb/s in a compact area of only 0.32 mm<sup>2</sup>. From the perspective of interfacing to a processor, a large array of low data rate channels relaxes signal processing and time multiplexing of the low data rate streams native to the processor [9, 27, 28]. Furthermore, wavelength-division multiplexing (WDM) sources for these numerous data streams are becoming readily available with the advent of chip-scale micro-combs [17, 29]. Our demonstration unlocks the tremendous potential of light as a high bandwidth and energy-efficient inter-chip communication medium, offering an immediate solution to the pressing challenge of AI scaling.

## RESULTS

We implement this high-density transceiver using compact photonic devices and dense, 28-nm node co-designed electronic circuits; however, the total density ultimately depends on the 3D bond spacings. To address this, we employ a high-density bonding process using copper pillar bumps. An electroplating process is used to form bumps on the photonic chip with copper pedestals capped with a layer of tin. The copper-tin bumps are then bonded to a nickel-plated electronic chip under a thermal and compressive force bringing the chips together. Figure 1b illustrates the layers on the electronic chip, photonic chip, and the bonding metals. We push the limits of this bonding technology by using a 15  $\mu\text{m}$  spacing and 10  $\mu\text{m}$  bump diameters (25  $\mu\text{m}$  pitch) in an array of 2,304 bonds. This process balances two potential failure modes for such close spacing: excessive tin causing flow and electrical shorting to adjacent bonds during bonding, and insufficient tin leading to brittle bonds [30]. We test our bonding process using cross-sectional scanning electron micrographs of the bonds, shown in Figure 1c, and by measuring the force needed to separate the bonded chips. The cross-section analysis reveals that tin does not flow to adjacent bonds, while the shear test demonstrates a robust 2.1 kg (114.9 MPa) force required to separate the bonded dies. Modeling and measurements show each pair of bonds (for a signal and ground) has a 10 fF capacitance (Methods). Figure 1d shows the assembled transceiver wire-bonded to a printed circuit board and optically coupled to a fibre array (Methods), while Figure 1e shows microscope images of the face-up electronic and photonic chips, and the face-down electronic chip flip-chip bonded on top of the larger photonic
chip. This bonding technique provides an ideal platform to achieve the required density for chip-to-chip data communication links.
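The bond-array figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions not stated in the results: that the 2,304 bonds form a square grid, and that the reported stress is the 2.1 kg load (taken as kilogram-force) divided by the summed cross-section of the 10 µm bumps. Under those assumptions the estimate lands close to the reported 114.9 MPa.

```python
import math

# Values quoted in the text.
NUM_BONDS = 2304          # bonds in the array
PITCH_UM = 25.0           # bump pitch in micrometres
BUMP_DIAMETER_UM = 10.0   # bump diameter in micrometres
SHEAR_LOAD_KG = 2.1       # reported load needed to separate the dies

# Assumption: the bonds are laid out as a square grid.
bonds_per_side = math.isqrt(NUM_BONDS)
array_side_mm = bonds_per_side * PITCH_UM / 1000.0

# Assumption: stress = load (as kilogram-force) / total bump cross-section.
G = 9.80665  # standard gravity in m/s^2
bump_area_m2 = math.pi * (BUMP_DIAMETER_UM * 1e-6 / 2) ** 2
stress_mpa = SHEAR_LOAD_KG * G / (NUM_BONDS * bump_area_m2) / 1e6

print(f"{bonds_per_side} x {bonds_per_side} bonds, {array_side_mm:.2f} mm per side")
print(f"estimated stress: {stress_mpa:.1f} MPa")
```

The resulting array side is consistent with a bond field that fits on the small electronic die described in the Methods.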

The 3D integrated chip contains an array of 80 transmitter cells and 80 receiver cells; these cells are organized into 20 waveguide buses with four wavelength channels per bus. Each transmitter cell has a local memory in the electronic chip that stores a pseudo-random bit sequence. A periodic clock signal triggers the electronics, and the transmitter cell electronics send out the programmed bit sequences as voltage pulses incident on the photonic modulator electrodes. These voltage pulses blue-shift the micro-disk resonance from a blocking to a non-blocking state and thus modulate an on-resonance laser line. Figure 2a illustrates the transmitter experiment and Figure 2b shows a schematic of the transmitter cell, while Figure 2c shows the spectrum of the modulator bus with four micro-disk resonances. After the transmitter characterization, we test the receivers, which function similarly: in each receiver cell, wavelength channels carry signals on the photonic chip and micro-rings selectively drop wavelengths onto each respective photodiode. The electronic chip then amplifies the photocurrent generated by the photodiode and writes the data into the local memory of each receiver cell, as illustrated in Figure 3a. Figure 3b shows a schematic of the receiver cell. For performance characterization, an on-chip circuit compares the receiver memory to expected data and keeps an error count that is periodically read out of the chip. This architecture of transmitters and receivers fills the array of channels in the dense area provided by the bonding process.

**FIG. 2. Transmitter characterization and performance.** **a**, Illustration of the transmitter experimental test setup showing a single wavelength laser channel modulated by the transmitter and measured on an oscilloscope. **b**, Transmitter cell circuit schematic. **c**, Optical spectrum of the four channel transmitter bus. **d**, Resonance shift of a representative micro-disk as a function of reverse-bias voltage applied to the vertical p-n junction. **e**, Transmitted signal extinction ratio (ER, power of ‘1’ bit divided by power of ‘0’ bit), insertion loss (IL, power of ‘1’ bit divided by power before the modulator), and normalized optical modulation amplitude (OMA, normalized power of ‘1’ bit minus power of ‘0’ bit) with a 1.5 V driver voltage. **f**, ER and IL at maximum OMA for a range of driver voltages. **g**, Measured energy consumption of the transmitter array for a range of driver voltages and a $\frac{1}{4}CV^2$ fit, where C is the capacitance charged by the driver voltage V. **h**, Bit error ratio (BER) measurement of the modulated signal input to a commercial receiver at 1 V, 1.25 V, and 1.5 V driver voltages; received power is the average signal power at the commercial receiver. **i**, Eye diagrams for all 80 modulators on the photonic chip at 10 Gb/s/modulator and 1 dBm input laser power.

**FIG. 3. Receiver characterization and performance.** **a**, Illustration of the receiver test setup showing a laser line modulated by a commercial transmitter and received by the 3D integrated photonic-electronic receiver. **b**, Receiver cell circuit schematic. **c**, Responsivity (light to electrical current conversion efficiency) measurement of the photodiode. **d**, Optical spectrum of receiver bus. **e**, Bit error ratio (BER) test of a receiver cell using a commercial transmitter signal; received power is the average signal power at the photodiode.

The transmitter cell within the 80-cell array consumes 50 fJ/bit when driving the micro-disks with a 1 V swing. This energy is dynamic, equal to $\frac{1}{4}CV^2$, where C is the capacitance being charged or discharged during a bit transition, and V is the charged-to-discharged voltage [31]. The vertical p-n junction micro-disk enables a low voltage drive by featuring a higher overlap of the p-n depletion region and the optical whispering gallery mode of the disk compared to lateral junctions [15], and results in an electrical-to-optical response of 75 pm resonance shift per applied volt (Fig. 2d). We further characterize this response with the dynamic insertion loss (IL, the power of a '1' bit divided by the power before the modulator) and extinction ratio (ER, the power of a '1' bit divided by the power of a '0' bit). Figure 2e shows these metrics, captured from the modulated signal output of a transmitter cell driven at 1.5 V. In this measurement, the laser wavelength moves into the shifting resonance and the optical modulation amplitude (OMA, the power of a '1' bit minus the power of a '0' bit) of the output signal increases, reaching a maximum at 2.5 dB IL and 4 dB ER. Figure 2f shows ER and IL at maximum OMA for driver voltages between 1 V and 1.5 V. These high ERs and low ILs per volt enable a reduced V in $\frac{1}{4}CV^2$. Capacitance sources include the micro-disk p-n junction (128 fF), bond pads (10 fF), and capacitances within the driver circuit (61 fF), combining for a total expected capacitance of 199 fF (Methods). These capacitances exhibit low values through micro-disk compactness, miniaturized bonds, and careful design in the 28 nm electronic chip technology. Figure 2g shows the transmitter energy consumption as all 80 modulators are transmitting data with drive voltages ranging between 1 V and 1.5 V. The $\frac{1}{4}CV^2$ model fits 198 fF total capacitance per cell using this data, aligning closely with the expected 199 fF from the independently measured and modeled devices. Next, we record eye diagrams for each of the 80 transmitters on the chip with a drive at 10 Gb/s/transmitter and 1 dBm laser power before the modulator (Fig. 2i). As every transmitter modulates, we measure current and voltage on the transmitter power supply to obtain the energy consumption reported above (Fig. 2g). With no optical amplification, the oscilloscope receiver is the limiting factor of the eye qualities with an 8 μW input-referred root mean square noise (denoted as input-referred noise hereafter). All 80 eye diagrams in the array are open and uniform, which confirms the high yield of the bonding process and validates our many-channel approach. As a further confirmation of transmitter signal quality, a bit error ratio (BER) test with a reference receiver demonstrates error-free performance ($BER < 10^{-12}$) down to a receiver noise-determined power for 1 V, 1.25 V, and 1.5 V modulator drives (Fig. 2h). In the following section, our on-chip receiver shows a dramatically reduced input-referred noise of 480 nW. The 80 channel photonic transmitter array outputs an aggregate data rate of 800 Gb/s and occupies an area of 0.15 mm<sup>2</sup>, demonstrating an unprecedented bandwidth density of 5.3 Tb/s/mm<sup>2</sup>.
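The transmitter numbers in this paragraph follow from simple arithmetic, which the short sketch below reproduces. It uses only values quoted in the text; the function and variable names are illustrative and not taken from the authors' analysis.

```python
# Capacitance contributions quoted in the text, in farads.
C_DISK = 128e-15    # effective micro-disk p-n junction capacitance
C_PADS = 10e-15     # bond pads
C_DRIVER = 61e-15   # capacitance inside the driver circuit


def energy_per_bit(c_total, v_swing):
    """Dynamic energy per bit from the (1/4) C V^2 model used in the text."""
    return 0.25 * c_total * v_swing ** 2


c_total = C_DISK + C_PADS + C_DRIVER
for v in (1.0, 1.25, 1.5):
    print(f"{v:.2f} V drive -> {energy_per_bit(c_total, v) * 1e15:.1f} fJ/bit")

# Bandwidth density of the transmitter array: 80 channels at 10 Gb/s in 0.15 mm^2.
aggregate_gbps = 80 * 10
density_tbps_per_mm2 = aggregate_gbps / 1000 / 0.15
print(f"{density_tbps_per_mm2:.1f} Tb/s/mm^2")
```

At a 1 V swing the model gives just under 50 fJ/bit, and the quadratic dependence on V shows why a low drive voltage matters as much as low capacitance.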

The receiver cell consumes 70 fJ/bit when receiving a 10 Gb/s signal at -24.85 dBm average power with a $4 \times 10^{-10}$ BER. The receiver spends energy as a static biasing of the electronic amplifier. The photodiode is a vertical p-silicon, i-germanium, n-germanium diode that efficiently converts optical signals to electrical current with an efficiency of 1 A/W (Fig. 3c). The capacitance of this photodiode is crucial since receiver noise is proportional to the amplifier input capacitance and is compensated by static biasing power (Methods). Minimizing this noise is critical for reducing laser power sourced into the link and improving energy efficiency. With a measured photodiode capacitance of 17 fF and pad capacitances of 10 fF, the simulated input-referred noise is 300 nW. A BER test is used to evaluate this performance; a signal from an ideal modulator is used to send a 10 Gb/s data stream into the chip from which on-chip circuits measure errors in the received bits. On the photonic chip, a ring resonator on a four-channel bus filters the modulated signal to a photodiode. Figure 3d shows the four-channel bus spectrum. We then gradually reduce the signal power while counting errors on the electronic chip, obtaining the BER curve in Figure 3e. This test reveals that the receiver has a sensitivity of -24.85 dBm for a $4 \times 10^{-10}$ BER, resulting in a measured input-referred noise of 480 nW using the 19 dB ER signal. We record the power of the receiver from its power supply at -24.85 dBm input optical power with 70 fJ/bit consumption. This result, along with the transmitter cell performance, demonstrates that both the receiver and transmitter consume less than 100 fJ/bit while supporting a massive bandwidth transfer of 800 Gb/s from the dense array.
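To put the receiver figures in more familiar units, the sketch below converts the quoted sensitivity to linear power, estimates the corresponding average photocurrent from the 1 A/W responsivity, and converts 70 fJ/bit at 10 Gb/s into a static power per cell. It is a unit-conversion aid only and does not attempt to reproduce the noise analysis; the names are illustrative.

```python
def dbm_to_watts(p_dbm):
    """Convert optical power from dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)


# Values quoted in the text.
SENSITIVITY_DBM = -24.85       # average received power at the photodiode
RESPONSIVITY_A_PER_W = 1.0     # photodiode responsivity
ENERGY_PER_BIT_J = 70e-15      # receiver energy per bit
BIT_RATE_BPS = 10e9            # per-channel data rate

p_avg_w = dbm_to_watts(SENSITIVITY_DBM)
i_avg_a = RESPONSIVITY_A_PER_W * p_avg_w
rx_power_w = ENERGY_PER_BIT_J * BIT_RATE_BPS

print(f"average optical power: {p_avg_w * 1e6:.2f} uW")
print(f"average photocurrent:  {i_avg_a * 1e6:.2f} uA")
print(f"receiver cell power:   {rx_power_w * 1e3:.2f} mW")
```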

**FIG. 4. Transmitter to receiver data communication link demonstration.** **a**, Illustration of the transmitter to receiver link showing a 3D integrated photonic-electronic transmitter modulating four laser lines and a separate photonic-electronic receiver converting the four data channels back to the electrical domain. **b**, Spectrum of the link laser source. **c**, Eye diagrams of the four channels after the transmitter. **d**, Bit error ratios (BERs) of the data channels after the receiver.

The transmitter and receiver cells independently demonstrate a combined 120 fJ/bit; we next connect them and validate their combined performance. Optical fibre connects two separate transceivers as a complete data communication link, with one transceiver functioning as a transmitter and the other as a receiver (Fig. 4a). A shared clock synchronizes the two electronic chips, and programmable clock delays in each receiver cell align the transmitted data with the receiver sampling point. A laser diode array provides four wavelength channels at -5 dBm power per channel, which feed into a bus on the transmitter chip. Figure 4b shows the spectrum of these laser channels. Individual transmitter cells simultaneously modulate each wavelength at 8 Gb/s at a 1.5 V drive, resulting in open eye diagrams for each channel (Fig. 4c). The signal powers are too low for detection by the diagnostic oscilloscope receiver, so we amplify the signal before the oscilloscope and normalize the eye diagrams. However, the signals do not require amplification for the electronic-photonics receiver due to its high sensitivity. The average power per channel at each receiver photodiode is -19.5 dBm. On-chip error counters record errors in each channel in one-minute intervals, revealing a maximum recorded BER of $6 \times 10^{-8}$ and a minimum count of no errors in the interval, denoted as $10^{-12}$, in Figure 4d. This result shows that the transmitters and receivers within the 3D integration can form a complete low-power, high-bandwidth link needed for next-generation computing systems.
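For context on the zero-error intervals in the link test, the sketch below counts the bits that pass through one channel in a one-minute interval at 8 Gb/s and applies the standard zero-error confidence bound. The assumption of independent bit errors and the 95% confidence level are choices made here for illustration, not values from the experiment.

```python
import math

# Values quoted in the text.
BIT_RATE_BPS = 8e9     # per-channel data rate in the link demonstration
INTERVAL_S = 60.0      # error counters are read in one-minute intervals

bits_per_interval = BIT_RATE_BPS * INTERVAL_S


def ber_upper_bound(n_bits, confidence=0.95):
    """Upper bound on BER when zero errors are seen in n_bits.

    Assumes independent bit errors: (1 - BER)^n >= 1 - confidence,
    which for small BER gives BER <= -ln(1 - confidence) / n.
    """
    return -math.log(1.0 - confidence) / n_bits


print(f"bits per interval: {bits_per_interval:.2e}")
print(f"95% upper bound with zero errors: {ber_upper_bound(bits_per_interval):.1e}")
```

A single error-free minute therefore constrains the BER to the low $10^{-12}$ range, which is in line with plotting such intervals at $10^{-12}$.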

## DISCUSSION

Integrated photonic chips present a promising low-power platform to address the data transfer demands of AI computing. Here, we realized this promise by demonstrating a scaled-up array of 80 channels on a single electronic-photonics, densely 3D integrated transceiver. This multi-chip module consumes minimal energy by virtue of the large number of channels, cutting-edge low-capacitance bonding technology, co-designed electronic-photonics circuits, and advanced devices used. While our demonstrated system achieves record performance in terms of energy efficiency and bandwidth density, the performance can be further improved in future implementations. Although the micro-disks used in this demonstration exhibit high performance, resonant modulators can be developed with lower capacitance [15, 32] and a higher electro-optical response [15, 33], both of which would decrease the dynamic power of the transmitter. Similarly, on the receiver side, lower capacitance photodiodes [34, 35] could reduce the power and noise of the receiver architecture. However, miniaturizing photodiodes requires considering a loss of responsivity [36], presenting complex link-level tradeoffs. Additionally, the energy consumption of the electronic circuits can be further reduced by moving to a more advanced CMOS node. While
our demonstrated bonding technology is approaching the limit of how closely spaced tin bonds can be made, further density scaling could be realized through the development of hybrid bonding [11]. However, after achieving the low capacitance value of the bonds demonstrated in this work, pursuing a further reduction in bond capacitance would yield diminishing returns in terms of energy efficiency.

Reduced chip-to-fibre optical losses can improve the loss budget of our demonstrated link, and laser powers may be as low as 47 fJ/bit with a distributed feedback laser and 30 fJ/bit for a scalable, high channel count comb laser (Supplementary Note 1). Furthermore, silicon resonators are sensitive to temperature and fabrication variation, requiring thermal control circuits [37, 38] and, for minimized power, reduced heat leakage into the environment using methods such as a silicon substrate removal around resonators [25, 39, 40]. Detailed wafer-scale resonance variation data and approximate thermal energy contributions across a range of scenarios provide a best case thermal energy consumption of 71 fJ/bit (Supplementary Note 2). Additionally, the photonic circuits are highly polarization-sensitive and require polarization-maintaining fibre or the addition of polarization management circuits [36, 41, 42]. Finally, while we demonstrate high bandwidth density, a higher per-fibre bandwidth and photonic chip-edge bandwidth density can be achieved with wavelength scaling by cascading more arrayed channels onto fewer waveguide buses [17, 18]. This architecture can combine with chip-scale frequency combs to generate hundreds of wavelength channels [17].

While the potential impact of this technology is evident for the advancement of energy-efficient AI computing, its use may extend to far-reaching applications. These low-power, massively parallel optical links could enable pervasive device connectivity and transform computing by streamlining resource allocation through optically linked, disaggregated, and reconfigurable computing and memory resources [28, 43–45], revolutionizing the computing landscape over the next decade and beyond.

## METHODS

**Transceiver assembly.** Separate CMOS foundries are used to fabricate the electronic and photonic chips. The photonic chips were fabricated through the American Institute for Manufacturing Integrated Photonics (AIM Photonics) on a custom 300 mm silicon-on-insulator wafer. The AIM Photonics process design kit (PDK) includes the micro-disks, ring filters, and photodiodes [46]. The electronic chips were fabricated through Taiwan Semiconductor Manufacturing Company Limited (TSMC) on a shared multi-project wafer in a 28 nm CMOS process node. Both chips feature a square array of 2,304 aluminium pads at a 25 $\mu\text{m}$ pitch on their top surfaces. These pads connect to lower metal layers within each chip and the devices on the silicon layer.

The photonic and electronic chips are then processed post-fabrication before bonding their pad arrays together. In this step, we core the 300 mm photonic wafer to a 200 mm wafer and use a wafer-level process to bump its pads with electroplated layers of copper and tin. The electronic chips, received as individual 1.6 $\text{mm}^2$ units from a shared wafer, are unsuitable for wafer-level photolithography-based processes. Instead, we adopt a chip-level process of electroless nickel plating, followed by an additional layer of immersion gold plating to prevent nickel oxidation. After dicing the bumped photonic wafer into 6.5 mm by 3 mm chips, a thermo-compression bond is used to connect the bumped photonic chips to the plated electronic chips.

To power and operate the transceiver, we create electrical connections to the electronic chip through the bonds to the photonic chip. Metal layers on the photonic chip wire these connections to large electrical pads on an exposed edge of the photonic chip. Wire-bonds connect these pads to a printed circuit board (PCB), which connects to: (i) a micro-controller that programs the electronic chip, (ii) power sources that supply the electronic chip voltage rails, and (iii) a radio-frequency (RF) clock generator for the 5 GHz clock of the electronic chip. This clock line is a coplanar RF waveguide on the PCB with a matched impedance to 50  $\Omega$  RF cables. Optical fibres couple light to waveguide buses through silicon nitride edge couplers; these couplers are on the side of the photonic chip that is opposite the wire-bond pads. A micro-positioner is used to align a standard single mode fibre v-groove array with the edge-couplers, which are spaced at a 127  $\mu\text{m}$  pitch. The assembly procedures of photonic wafer bumping and bonding, electronic chip plating, and wire-bonding are conducted at Micross AIT, CVI, and Cornell University, respectively.

**Capacitance models and measurements.** We identify several sources of capacitance that affect energy efficiency: (i) chip pads, (ii) bump parasitics, (iii) the micro-disk junction, (iv) the photodiode junction, and (v) electronic driver capacitance. Extended Data Figure 1a depicts these capacitance sources. These capacitances are determined through electrostatic simulations, circuit model simulations, and empirical measurements. Focusing initially on the bumps and photonic chip pads, an electrostatic solver (Ansys Maxwell) is used to simulate the photonic chip pad-to-substrate capacitance (4 fF) and bump-to-bump parasitic capacitance (< 1 fF). Similarly, we simulate an electronic-chip pad model with extracted parasitic capacitances (Cadence Virtuoso), yielding an electronic chip pad capacitance of 6 fF. This extracted circuit model simulation also results in an effective 61 fF capacitance inside the electronic driver.

Experimental methods are used to determine the capacitances of the micro-disk and photodiode junctions, along with a validation of the simulated pad capacitance. A vector-network analyzer (VNA, Keysight P5007A) is used to record an electrical RF reflection from probed devices. We fit the magnitude and phase of the reflected wave across RF frequencies to a reflection from a lumped complex impedance. Extended Data Figure 1b shows the imaginary impedances of the devices and their associated fitted capacitor impedances. During these measurements, an RF bias tee provides a DC reverse bias voltage to the device. An electronic calibration module (Keysight N4693D) is used to calibrate out the VNA response of the RF cable up to the output of the bias tee. The bias tee connects to a $25\ \mu\text{m}$ pitch RF probe (FormFactor InfinityXT) and this probe response is calibrated out of the results by de-embedding its unlanded response as an electrical open. After this calibration, measurements of the photonic pad and bumped photonic pad show capacitances of $3\ \text{fF}$ and $4\ \text{fF}$, respectively. Subsequent measurements are used to first de-embed these pad responses and then measure the photodiode and micro-disk capacitances. The measured photodiode capacitance is $17\ \text{fF}$.

As anticipated from a p-n junction, the micro-disk capacitance decreases with an increasing reverse bias. Extended Data Figure 1c shows the measured capacitance functions of the four micro-disks. The energy spent per bit transition is the integral of this capacitance function weighted by the difference of supply voltage and output voltage [31]. A midpoint Riemann sum of the micro-disk capacitances, weighted by reverse bias voltage subtracted from  $1\ \text{V}$ , between  $0$  and  $1\ \text{V}$  bias voltage, yields an effective disk junction capacitance for dynamic energy consumption of  $128\ \text{fF}$  (averaged across the four micro-disks). A summation of the pad, driver, and junction capacitances results in a  $199\ \text{fF}$  transmitter capacitance. This result is in excellent agreement with the capacitance directly measured from the energy consumption of the transmitter ( $198\ \text{fF}$ ), validating the capacitance models and measurements.
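The weighted midpoint Riemann sum described above can be sketched as follows. The $2/V_{dd}^2$ normalization is an assumption chosen here so that a constant capacitance returns itself (the weighted integral of a constant C equals $\frac{1}{2}CV_{dd}^2$); the text does not state the normalization explicitly. The `example_junction` curve is a hypothetical textbook junction model with made-up parameters, not the measured micro-disk data.

```python
def effective_capacitance(c_of_v, v_dd, steps=1000):
    """Effective capacitance from a voltage-dependent C(V).

    Midpoint Riemann sum of C(V) * (v_dd - V) over 0..v_dd, normalized by
    v_dd^2 / 2 (assumed normalization) so a constant C maps to itself.
    """
    dv = v_dd / steps
    total = 0.0
    for k in range(steps):
        v_mid = (k + 0.5) * dv            # midpoint of the k-th interval
        total += c_of_v(v_mid) * (v_dd - v_mid) * dv
    return 2.0 * total / v_dd ** 2


def example_junction(v, c0=150e-15, v_bi=0.8):
    """Hypothetical abrupt-junction C(V); NOT the measured micro-disk curve."""
    return c0 / (1.0 + v / v_bi) ** 0.5


c_eff = effective_capacitance(example_junction, v_dd=1.0)
print(f"effective capacitance of the example curve: {c_eff * 1e15:.1f} fF")
```

Because the weight $(V_{dd} - V)$ is largest near zero bias, the effective value sits closer to the zero-bias capacitance than a plain average over the swing would.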

**Transmitter characterization.** Each transmitter result is experimentally measured from the 3D integrated photonic-electronic chip. An exception is the DC electro-optic response measurement, for which a separate photonic chip with the same modulator design is used. We apply a voltage to a probe to reverse bias the modulator at varying DC voltages, and use an optical spectrum analyzer (Keysight 8164B) to record each response. The remainder of the transmitter, receiver, and link characterizations employ an optical switch (Polatis 1000n 24x10). This switch optically connects equipment and devices-under-test and minimizes fibre mating cycles. This approach streamlines the measurement process and eliminates potential power discrepancies that might stem from fibre mating inconsistencies. An optical spectrum analyzer (Aragon BOSA 400) is used to measure the transmitter bus spectrum. For dynamic data transmission, a micro-controller is used to write a different PRBS6 pattern into the 64-bit registers in each of the 80 transmitter cells. Next, all modulators transmit
this data simultaneously as the electronic chip is clocked, and all data registers are driven out of the chip to their respective micro-disks. The 64-bit pattern transmitted by each modulator repeats indefinitely as the chip clock is running. In this state, we record the eye diagrams of each modulator, dynamic characteristics of the modulators, and transmitter energy consumption from the electronic driver array voltage rail. A narrow linewidth tunable laser (Santec TSL-210) is used as the light source in these measurements. Laser light travels through a fibre polarization controller and then into the chip. An oscilloscope (Tektronix DSA8300 with an 80C01 Optical Sampling Module) is used to receive modulated light for dynamic characterization and eye diagrams. In the bit error ratio test, modulated light initially passes through a variable optical attenuator (VOA) before reaching a commercial receiver (Thorlabs RXM40AF). The commercial receiver converts the optical signals into electrical signals that are read by a bit error ratio tester (BERT, Anritsu MU195040A). We sweep the received optical power with the VOA and record errors from the BERT to construct the transmitter BER curves.
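As an illustration of the test patterns, a PRBS6 sequence can be generated with a 6-bit linear feedback shift register. The feedback taps below are an assumption (one common maximal-length choice); the text does not specify the polynomial, the seeds used for each cell, or how the 63-bit sequence is fitted to the 64-bit register, so none of those details are filled in here.

```python
def prbs6(seed=0b000001, length=63):
    """Generate bits from a 6-bit Fibonacci LFSR.

    Assumed recurrence: s[n] = s[n-6] XOR s[n-5], a maximal-length choice
    with period 63. The seed must be a nonzero 6-bit value.
    """
    if not 0 < seed < 64:
        raise ValueError("seed must be a nonzero 6-bit value")
    state = seed
    bits = []
    for _ in range(length):
        new_bit = ((state >> 5) ^ (state >> 4)) & 1   # XOR of the two oldest bits
        state = ((state << 1) | new_bit) & 0x3F        # shift in the new bit
        bits.append(new_bit)
    return bits


pattern = prbs6()
print("".join(str(b) for b in pattern))
```

Different nonzero seeds yield cyclic shifts of the same sequence, which is one plausible way to give each cell a distinct pattern.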

**Receiver characterization.** An ideal modulation source and an on-chip bit-error checker circuit are used to characterize the receiver cell performance. Separately, a tunable laser (Keysight 8164B) and a DC electrical probe landed on photodiode pads are used to measure the photodiode responsivity; the probe applies a reverse bias voltage and senses photocurrent from a known input laser power. Next, for the dynamic characterization, we use an ideal modulation source (Thorlabs MX35E) consisting of an internal laser and a lithium niobate Mach-Zehnder modulator. A pulsed pattern generator (PPG, Anritsu MU195020A) is used to drive the modulator with a repeating 64-bit PRBS6 pattern. The signal travels through fibre and a polarization controller before coupling into the photonic chip. Voltage is applied to a doped-silicon resistor adjacent to the ring filter to generate heat and tune the ring filter resonance to the desired wavelength channel. The ring resonator drops the signal to a photodiode, which converts it from light into photocurrent for the electronic chip to then amplify. Timing circuits continuously write the received bits into a 64-bit long memory in a cycle. For timing, a programmable timing offset circuit in the electronic chip and a timing offset of the PPG align the incoming data to the receiver sampling point. A split clock source synchronizes the receiver chip and PPG clock frequencies. As the final step, readouts from the serial programming port display the saved received bits and confirm data reception. However, the serial port cannot update fast enough to give a bit error ratio in a short time frame. Instead, an on-chip error-counter circuit in each cell compares the received memory with pre-programmed expected bits and, if there is a discrepancy, it adds an error to an on-going count. Readouts from the serial port display this count and we obtain a BER curve as
we sweep signal power using a VOA inside the ideal modulation source.
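The error-counting scheme described above can be sketched behaviourally as follows. This is a software illustration only, not the on-chip circuit: the function names and the example bit pattern are hypothetical.

```python
def count_bit_errors(received, expected, width=64):
    """Number of differing bits between two width-bit words."""
    mask = (1 << width) - 1
    return bin((received ^ expected) & mask).count("1")


def bit_error_ratio(received_words, expected, width=64):
    """Accumulate errors over repeated captures of the memory and return the BER."""
    errors = sum(count_bit_errors(word, expected, width) for word in received_words)
    total_bits = width * len(received_words)
    return errors / total_bits


# Arbitrary example pattern; one of 1000 captures has two flipped bits.
expected = 0xA5A5F00F12345678
captures = [expected] * 999 + [expected ^ 0b101]
ber = bit_error_ratio(captures, expected)
```

Repeating this calculation at each attenuator setting gives one point of the BER curve per received optical power.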

**Link demonstration.** The link demonstration combines the previously described experimental setups of the transmitter and receiver. A microcontroller sets the sent bits in a transmitter chip and a second microcontroller reads the received bits in a separate receiver chip. Four channels of data transmit through the link at 8 Gb/s/channel and serial port readouts from the receiver record this data, along with an on-going error count for each channel. A shared clock signal synchronizes the two transceivers and a programmable delay block in each receiver cell delays the receiver sampling point with respect to the transmitter clock. A distributed feedback laser array (Thorlabs PRO8) is used as the four channel optical source for the link. An arrayed waveguide grating multiplexes these different wavelengths of light from the laser array onto a single fibre. We place polarization controllers before the transmitter chip and before the receiver chip to ensure optimal coupling into the fundamental transverse electric (TE) mode of each waveguide. The optical switch is used to direct light from the transmitter to an erbium-doped fibre amplifier (EDFA) and an oscilloscope for each eye diagram; the switch then directs light back to the 3D integrated receiver for BER measurements without amplification. Optical losses in this link amount to 14.5 dB. These losses are from several sources: three chip-to-fibre interfaces at 3 dB each account for 9 dB, the modulation insertion loss is 2.5 dB, a modulation penalty accounts for 1.5 dB (this is the difference between the optical power of a ‘1’ bit and the average power), and an extra 1.5 dB of power is lost through the optical switch and fibre connectors throughout the link.
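The loss budget above can be tallied in a few lines. The sketch below uses only the values quoted in the text; the dictionary keys and variable names are illustrative.

```python
# Link loss contributions in dB, as itemized in the text.
link_losses_db = {
    "chip-to-fibre interfaces (3 x 3 dB)": 3 * 3.0,
    "modulation insertion loss": 2.5,
    "modulation penalty": 1.5,
    "optical switch and fibre connectors": 1.5,
}

total_loss_db = sum(link_losses_db.values())

# Fraction of the launched optical power that reaches the receiver.
transmitted_fraction = 10 ** (-total_loss_db / 10)
```

The total comes to 14.5 dB, which means roughly 3.5% of the launched optical power reaches the receiver.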

**Electronic circuit architectures.** Transmitted data starts as bits in the memory of each transmitter cell. Timing circuits running on a 5 GHz input clock (half the transmitted data rate) generate memory read addresses and two-to-one multiplexer select signals. Circuits in the data path operate at a voltage supply of 1 V, except for the driver, which operates between 1 and 1.5 V. Extended Data Figure 3a shows the driver circuit. In this design, high speed 1 V transistors in a cascode configuration prevent transistor junction breakdown from supply voltages exceeding 1 V. The main driver branch (M5-M8) has wide transistors to reduce the switching delay on the modulator capacitance ( $C_{\text{microdisk}} = 128 \text{ fF}$ ). A capacitor ( $C_{\text{coupling}} = 183 \text{ fF}$ ) ensures a high switching speed while the auxiliary branch (M1-M4) holds the DC voltage level.

The receiver circuit senses a modulated photodiode current, amplifies it to digital levels at the supply rail voltage, and de-serializes the signal before writing it into internal memory. Extended Data Figure 3b shows the amplifier circuit, which uses an inverter-based transimpedance amplifier (TIA) as an initial gain stage followed by an equalizer and inverters. A programmable current digital-to-analog converter (DAC) at the amplifier input supplies a current (IDAC) that cancels the DC offset of the photodiode current. The TIA stage has a high feedback resistance for a high gain ( $R_f = 18.6 \text{ k}\Omega$ ). Through the Miller effect, this resistor presents a lower input resistance ( $R_{\text{in}} = 2.1 \text{ k}\Omega$ ); however, this input resistance combines with the input capacitance ( $C_{\text{photodiode}} = 17 \text{ fF}$ ,  $C_{\text{pad}} = 10 \text{ fF}$ ) to form a low-frequency pole. As a remedy, an active inductor circuit in the subsequent equalizer stage cancels out the TIA pole ( $R_{\text{eq}} = 3.1 \text{ k}\Omega$ ,  $C_{\text{eq}} = 33.6 \text{ fF}$ ). After the equalizer, the ensuing inverters ensure the output swings between 0 and 1 V. An isolated 1 V power supply for the receiver amplifiers mitigates supply noise.
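For a rough sense of scale, the quoted TIA values can be combined as in the sketch below. Two assumptions are made that the text does not state: the ideal shunt-feedback relation $R_{\text{in}} = R_f / (1 + A)$ is used to back out an implied amplifier gain $A$, and only the two listed capacitances are counted at the input node. Any amplifier input or bump capacitance would lower the real pole further, so the result is an upper estimate rather than a measured value.

```python
import math

R_F = 18.6e3    # feedback resistance in ohms (from the text)
R_IN = 2.1e3    # input resistance in ohms (from the text)
C_PD = 17e-15   # photodiode capacitance in farads (from the text)
C_PAD = 10e-15  # pad capacitance in farads (from the text)

# Assumption: ideal Miller relation R_in = R_f / (1 + A).
implied_gain = R_F / R_IN - 1

# Assumption: only the two listed capacitances load the input node.
c_in = C_PD + C_PAD
input_pole_hz = 1 / (2 * math.pi * R_IN * c_in)
```

Under these assumptions the implied gain is about 7.9 and the input pole sits near 2.8 GHz, well below the 10 Gb/s channel rate, which is consistent with the need for the equalizer stage.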

The TIA dominates the receiver amplifier energy consumption and its energy per bit is the static biasing power divided by the data rate. However, the TIA design introduces trade-offs between noise, bandwidth, and power. Equation 1 shows how the receiver signal-to-noise ratio, SNR, relates to receiver energy per bit ( $E/\text{bit}_{RX}$ ), input signal ( $I$ ), input capacitance ( $C$ ), and channel bandwidth (BW). Supplementary Note 3 provides a derivation of Equation 1. This relationship sets a boundary on channel data rate scaling. With constant SNR and  $C$ , the design can expand BW with an increase in the input signal. In this context, the energy per bit remains constant, with the growing BW balancing out added laser power. This could imply an indefinite data rate scaling. However, a rise in BW necessitates wider TIA transistors, which subsequently contribute significantly to the input capacitance,  $C$ . This sequence results in a degradation of SNR at the receiver for high BWs, establishing a cap on the energy-efficient per-channel data rate. To achieve higher data rates without compromising energy, the focus should be on parallel data communication across multiple channels. Similar conclusions have been made in other studies, which advocate for parallel channels operating at moderate data rates [9, 27].

$$SNR \sim \left( \frac{I}{BW} \right)^2 \frac{E/\text{bit}_{RX}}{C^2}. \quad (1)$$
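The scaling argument around Equation 1 can be explored numerically. The sketch below evaluates the SNR relative to a baseline design using only the proportionality in Equation 1; the function name and the example scale factors (including the 1.5x capacitance increase) are hypothetical.

```python
def relative_snr(signal_scale, bw_scale, energy_scale=1.0, cap_scale=1.0):
    """SNR relative to a baseline, following SNR ~ (I/BW)^2 * (E/bit) / C^2."""
    return (signal_scale / bw_scale) ** 2 * energy_scale / cap_scale ** 2


# Doubling BW with a doubled input signal at fixed C and E/bit keeps SNR unchanged.
snr_same = relative_snr(signal_scale=2, bw_scale=2)

# If wider TIA transistors also raise C by a hypothetical 1.5x, SNR drops.
snr_degraded = relative_snr(signal_scale=2, bw_scale=2, cap_scale=1.5)
```

The second case shows why the growth of input capacitance with bandwidth caps the energy-efficient per-channel data rate.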

## ACKNOWLEDGEMENTS

This work was supported by the U.S. Defense Advanced Research Projects Agency (DARPA) under PIPES Grant HR00111920014 and by the U.S. Advanced Research Projects Agency-Energy (ARPA-E) under EN-LITENED Grant DE-AR000843. S.D. acknowledges support by the National Science Foundation (NSF) GRFP under Grant DGE-1644869. We thank G. Keeler for leading the PIPES program, N. Abrams and M. Hattink for helpful discussions, and the engineering teams at AIM/SUNY Poly Photonics, Micross AIT, and CVI for their roles in the transceiver fabrication and assembly.

## AUTHOR CONTRIBUTIONS

S.D. designed the photonic chip and developed the 3D bonding process. S.D. led transceiver testing and photonic device analysis with assistance from A.R., S.W., A.N., and V.G. A.R. compiled chip designs for the custom photonic wafer run. S.W., D.K., C.O., and A.M. designed the electronic chip and conducted bring-up tests of the electronic chip. S.D. and M.C. designed the printed circuit boards. R.P. gathered wafer-level micro-disk fabrication variation data. A.M. and K.B. supervised the project.

## COMPETING INTERESTS

The authors declare no competing interests.

## DATA AVAILABILITY

The data that support the plots within this paper and other findings of this study are available from the corresponding author upon reasonable request.

## CORRESPONDENCE

Correspondence and requests for materials should be addressed to K.B. (email: bergman@ee.columbia.edu).

- [1] Carrasquilla, J. & Melko, R. G. Machine learning phases of matter. *Nature Physics* **13**, 431–434 (2017).
- [2] He, J. *et al.* The practical implementation of artificial intelligence technologies in medicine. *Nature Medicine* **25**, 30–36 (2019).
- [3] Kaack, L. H. *et al.* Aligning artificial intelligence with climate change mitigation. *Nature Climate Change* **12**, 518–527 (2022).
- [4] Dally, W. J., Keckler, S. W. & Kirk, D. B. Evolution of the graphics processing unit (gpu). *IEEE Micro* **41**, 42–51 (2021).
- [5] Mirabbasi, S., Fujino, L. C. & Smith, K. C. Through the looking glass—the 2023 edition: Trends in solid-state circuits from isscc. *IEEE Solid-State Circuits Magazine* **15**, 45–62 (2023).
- [6] Wang, G. *et al.* Zero++: Extremely efficient collective communication for giant model training. *arXiv preprint arXiv:2306.10209* (2023).
- [7] Pati, S., Aga, S., Islam, M., Jayasena, N. & Sinclair, M. D. Computation vs. communication scaling for future transformers on future hardware. *arXiv preprint arXiv:2302.02825* (2023).
- [8] Lee, B. G., Nedovic, N., Greer, T. H. & Gray, C. T. Beyond cpo: A motivation and approach for bringing optics onto the silicon interposer. *Journal of Lightwave Technology* **41**, 1152–1162 (2022).
- [9] Miller, D. A. Attojoule optoelectronics for low-energy information processing and communications. *Journal of Lightwave Technology* **35**, 346–396 (2017).
- [10] Rakowski, M. *et al.* Hybrid 14nm finfet-silicon photonics technology for low-power tb/s/mm<sup>2</sup> optical i/o. In *2018 IEEE Symposium on VLSI Technology*, 221–222 (IEEE, 2018).
- [11] Samanta, A. *et al.* A direct bond interconnect 3d co-integrated silicon-photonics transceiver in 12nm finfet with -20.3 dbm oma sensitivity and 691fj/bit. In *2023 Optical Fiber Communications Conference and Exhibition (OFC)*, 1–3 (IEEE, 2023).
- [12] Wu, Z. *et al.* Peta-scale embedded photonics architecture for distributed deep learning applications. *Journal of Lightwave Technology* (2023).
- [13] Hochberg, M. & Baehr-Jones, T. Towards fabless silicon photonics. *Nature Photonics* **4**, 492–494 (2010).
- [14] Xu, Q., Schmidt, B., Pradhan, S. & Lipson, M. Micrometre-scale silicon electro-optic modulator. *Nature* **435**, 325–327 (2005).
- [15] Timurdogan, E. *et al.* An ultralow power athermal silicon modulator. *Nature Communications* **5**, 1–11 (2014).
- [16] Michel, J., Liu, J. & Kimerling, L. C. High-performance ge-on-si photodetectors. *Nature Photonics* **4**, 527–534 (2010).
- [17] Rizzo, A. *et al.* Massively scalable kerr comb-driven silicon photonic link. *Nature Photonics* 1–10 (2023).
- [18] Rizzo, A. *et al.* Petabit-scale silicon photonic interconnects with integrated kerr frequency combs. *IEEE Journal of Selected Topics in Quantum Electronics* **29**, 1–20 (2022).
- [19] Wade, M. *et al.* A bandwidth-dense, low power electronic-photonic platform and architecture for multi-tbps optical i/o. In *2018 European Conference on Optical Communication (ECOC)*, 1–3 (IEEE, 2018).
- [20] Sun, C. *et al.* Teraphy: An o-band wdm electro-optic platform for low power, terabit/s optical i/o. In *2020 IEEE Symposium on VLSI Technology*, 1–2 (IEEE, 2020).
- [21] Wade, M. *et al.* An error-free 1 tbps wdm optical i/o chiplet and multi-wavelength multi-port laser. In *Optical Fiber Communication Conference*, F3C–6 (Optica Publishing Group, 2021).
- [22] Sun, C. *et al.* Single-chip microprocessor that communicates directly using light. *Nature* **528**, 534–538 (2015).
- [23] Atabaki, A. H. *et al.* Integrating photonics with silicon nanoelectronics for the next generation of systems on a chip. *Nature* **556**, 349–354 (2018).
- [24] Zheng, X. *et al.* Ultralow power 80 gb/s arrayed cmos silicon photonic transceivers for wdm optical links. *Journal of Lightwave Technology* **30**, 641–650 (2011).
- [25] Ban, Y. *et al.* Highly optimized o-band si ring modulators for low-power hybrid cmos-sipho transceivers. In *2023 Optical Fiber Communications Conference and Exhibition (OFC)*, 1–3 (IEEE, 2023).

- [26] Saeedi, S., Menezo, S., Pares, G. & Emami, A. A 25 gb/s 3d-integrated cmos/silicon-photonic receiver for low-power high-sensitivity optical communication. *Journal of Lightwave Technology* **34**, 2924–2933 (2016).
- [27] Georgas, M., Leu, J., Moss, B., Sun, C. & Stojanović, V. Addressing link-level design tradeoffs for integrated photonic interconnects. In *2011 IEEE Custom Integrated Circuits Conference (CICC)*, 1–8 (IEEE, 2011).
- [28] Zhu, Z. *et al.* Photonic switched optically connected memory: An approach to address memory challenges in deep learning. *Journal of Lightwave Technology* **38**, 2815–2825 (2020).
- [29] Gaeta, A. L., Lipson, M. & Kippenberg, T. J. Photonic-chip-based frequency combs. *Nature Photonics* **13**, 158–169 (2019).
- [30] Li, Z. *et al.* Scaling solder micro-bump interconnect down to 10 um pitch for advanced 3d ic packages. In *2021 IEEE 71st Electronic Components and Technology Conference (ECTC)*, 451–456 (IEEE, 2021).
- [31] Miller, D. A. Energy consumption in optical modulators for interconnects. *Optics Express* **20**, A293–A308 (2012).
- [32] Sun, J. *et al.* A 128 gb/s pam4 silicon microring modulator with integrated thermo-optic resonance tuning. *Journal of Lightwave Technology* **37**, 110–115 (2018).
- [33] Gevorgyan, H., Khilo, A., Wade, M. T., Stojanović, V. M. & Popović, M. A. Miniature, highly sensitive moscap ring modulators in co-optimized electronic-photonic cmos. *Photonics Research* **10**, A1–A7 (2022).
- [34] Lischke, S. *et al.* High bandwidth, high responsivity waveguide-coupled germanium pin photodiode. *Optics Express* **23**, 27213–27220 (2015).
- [35] Chen, H. *et al.* -1 V bias 67 GHz bandwidth Si-contacted germanium waveguide PIN photodetector for optical links at 56 Gbps and beyond. *Optics Express* **24**, 4622–4631 (2016).
- [36] Chrostowski, L. & Hochberg, M. *Silicon photonics design: from devices to systems* (Cambridge University Press, 2015).
- [37] Sun, C. *et al.* A 45 nm cmos-soi monolithic photonics platform with bit-statistics-based resonant microring thermal tuning. *IEEE Journal of Solid-State Circuits* **51**, 893–907 (2016).
- [38] Hattink, M., Dai, L. Y., Zhu, Z. & Bergman, K. Streamlined architecture for thermal control and stabilization of cascaded dwdm micro-ring filters bus. In *Optical Fiber Communication Conference*, W2A–2 (Optica Publishing Group, 2022).
- [39] Masood, A. *et al.* Comparison of heater architectures for thermal control of silicon photonic circuits. In *10th International Conference on Group IV Photonics*, 83–84 (IEEE, 2013).
- [40] Rizzo, A. *et al.* Ultra-efficient foundry-fabricated resonant modulators with thermal undercut. In *CLEO: Science and Innovations*, SF2K–6 (Optica Publishing Group, 2023).
- [41] Ma, Y. *et al.* Symmetrical polarization splitter/rotator design and application in a polarization insensitive wdm receiver. *Optics Express* **23**, 16052–16062 (2015).
- [42] Park, A. H., Shoman, H., Ma, M., Shekhar, S. & Chrostowski, L. Ring resonator based polarization diversity wdm receiver. *Optics Express* **27**, 6147–6157 (2019).
- [43] Jouppi, N. *et al.* Tpu v4: An optically reconfigurable supercomputer for machine learning with hardware support for embeddings. In *Proceedings of the 50th Annual International Symposium on Computer Architecture*, 1–14 (2023).
- [44] Gonzalez, J. *et al.* Optically connected memory for disaggregated data centers. *Journal of Parallel and Distributed Computing* **163**, 300–312 (2022).
- [45] Michelogiannakis, G. *et al.* A case for intra-rack resource disaggregation in hpc. *ACM Transactions on Architecture and Code Optimization (TACO)* **19**, 1–26 (2022).
- [46] Fahrenkopf, N. M. *et al.* The aim photonics mpw: A highly accessible cutting edge technology for rapid prototyping of photonic integrated circuits. *IEEE Journal of Selected Topics in Quantum Electronics* **25**, 1–6 (2019).



FIG. Extended Data 1. **Device capacitances.** **a**, Schematic of the transmitter and receiver capacitance sources; 1: electronic driver circuit, 2: electronic chip pads, 3: bump parasitics, 4: micro-disk PN junction, 5: photodiode PIN junction, 6: photonic chip pads. **b**, Measured imaginary impedances of devices and their fitted capacitor impedances. **c**, Measured micro-disk junction capacitance as a function of reverse bias voltage.



FIG. Extended Data 2. **Transmitter and receiver electronic circuit schematics.** **a**, Transmitter driver circuit schematic and driver highlighted in the transmitter illustration; data as  $V_{in}$  drives the micro-disk as  $V_{out}$ . **b**, Receiver circuit schematic and the receiver highlighted in the receiver illustration; amplification converts  $I_{in}$  to  $V_{out}$ . In the schematics, a multiplier labels each transistor width as a multiple of a ‘1x’ transistor. The ‘1x’ transistor has a width of 500 nm and length of 30 nm in the transmitter and 300 nm by 30 nm in the receiver.

## Supplementary Information: 3D photonics for ultra-low energy, high bandwidth-density chip data links

Stuart Daudlin<sup>1</sup>, Anthony Rizzo<sup>1,2</sup>, Sunwoo Lee<sup>3</sup>, Devesh Khilwani<sup>3</sup>, Christine Ou<sup>3</sup>, Songli Wang<sup>1</sup>, Asher Novick<sup>1</sup>, Vignesh Gopal<sup>1</sup>, Michael Cullen<sup>1</sup>, Robert Parsons<sup>1</sup>, Alyosha Molnar<sup>3</sup>, and Keren Bergman<sup>1,\*</sup>

<sup>1</sup>*Department of Electrical Engineering, Columbia University, New York, NY 10027*

<sup>2</sup>*Air Force Research Laboratory Information Directorate, Rome, NY 13441*

<sup>3</sup>*Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853*

### SUPPLEMENTARY NOTE 1: LIGHT SOURCE ENERGY CONSUMPTION

While a detailed analysis of the energy consumption of the electronic-photonic interface is provided in the main text, the energy consumption associated with the light source must also be considered to give a comprehensive evaluation of the total system energy. In this Supplementary Note, we provide an estimated laser energy consumption for two distinct classes of light sources: (i) distributed feedback laser arrays and (ii) integrated frequency comb sources. Furthermore, we show a detailed path to improving the link budget through experimentally validated low-loss chip-fiber couplers fabricated in the same process flow used for the photonic chips demonstrated in the main text. Through the reduced losses from these improved coupling interfaces, the required optical power from the laser sources is reduced and thus the associated energy consumption is projected to be substantially lower in future iterations of the system.

#### Reduction of Chip-Fiber Interface Losses

The total losses accrued across the full data communication link directly dictate the required optical power for each wavelength, and thus decreasing these losses directly results in reduced energy consumption of the optical source. In this demonstration, one of the main loss mechanisms is through each chip-fiber interface which occurs at three points in the link (source to transmitter, transmitter to fiber, and fiber to receiver). This loss is greatly improved relative to standard silicon photonic edge couplers by implementing suspended edge couplers which reduce the modal overlap between the inverse taper tip and the silicon substrate. Since the waveguide mode field diameter (MFD) is expanded using an inverse taper structure to match a standard SMF-28 fiber mode ( $MFD \approx 10 \mu\text{m}$ ) [1], this large mode has substantial overlap with the silicon substrate at the chip facet since the edge coupler tip is only vertically separated from the substrate by  $2 \mu\text{m}$  of buried oxide (BOX). This overlap distorts the symmetry of the modal shape, resulting in substantial mode-mismatch loss between the edge coupler and fiber. However, this modal
overlap with the substrate can be eliminated through selective substrate removal beneath the edge coupler, which is then replaced with silicon dioxide with an index of refraction matched to the rest of the oxide cladding.

Fig. 1a shows the simulated modes for the fiber and edge coupler both for the standard design (including silicon substrate) and the suspended design (with the silicon substrate removed). From these simulations, it is clear that the suspended design yields better modal overlap between the fiber and edge coupler due to the improved symmetry. We implemented these suspended edge couplers in the foundry process used for the photonic chips in the demonstrated electronic-photonic transceiver (AIM Photonics). This process change was implemented at wafer-scale on 300 mm wafers. While the suspended edge couplers in the transceiver demonstrated a coupling loss of 3 dB per facet, this loss deviates from the simulated performance due to the absence of a deep trench etch on the bumped photonic wafers. Without this deep trench etch, the chips are mechanically diced without a smooth facet, thus adding additional loss. However, test devices in the same process including the deep trench etch exhibited experimentally measured coupling losses as low as 1.1 dB/facet with standard SMF-28 fiber and index matching fluid (Fig. 1c). In future link implementations using these improved devices with both the selective substrate removal and deep trench etch with polished facet, the link budget can be improved by 5.7 dB (1.9 dB improvement at each fiber-chip interface) to yield 8.8 dB total link losses. With the experimentally measured receiver sensitivity of -19.5 dBm, this indicates that the optical power required from the laser at each wavelength can be reduced from -5 dBm to -10.7 dBm (0.085 mW).
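The link-budget arithmetic in this paragraph can be reproduced with the short sketch below. All numerical values are taken from the text; the constant and function names are illustrative.

```python
SENSITIVITY_DBM = -19.5        # measured receiver sensitivity
DEMONSTRATED_LOSS_DB = 14.5    # measured total link loss
FACET_LOSS_NOW_DB = 3.0        # demonstrated coupling loss per facet
FACET_LOSS_IMPROVED_DB = 1.1   # test-device coupling loss per facet
N_FACETS = 3                   # chip-fiber interfaces in the link


def dbm_to_mw(power_dbm):
    """Convert optical power from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def required_laser_dbm(link_loss_db):
    """Per-wavelength laser power needed to just reach the receiver sensitivity."""
    return SENSITIVITY_DBM + link_loss_db


improvement_db = N_FACETS * (FACET_LOSS_NOW_DB - FACET_LOSS_IMPROVED_DB)
improved_loss_db = DEMONSTRATED_LOSS_DB - improvement_db

power_now_dbm = required_laser_dbm(DEMONSTRATED_LOSS_DB)
power_improved_dbm = required_laser_dbm(improved_loss_db)
power_improved_mw = dbm_to_mw(power_improved_dbm)
```

This gives a 5.7 dB improvement, 8.8 dB of total link loss, and a required power that falls from -5 dBm to -10.7 dBm (about 0.085 mW), matching the values quoted above.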

#### Distributed Feedback Laser Array Energy Consumption

Distributed feedback (DFB) laser arrays represent a mature multi-wavelength light source for wavelength division multiplexing (WDM) and have been shown in numerous demonstrations for applications in optical communications [5–7]. Since DFB lasers tend to exhibit peak wall-plug efficiency (WPE) at higher optical output powers, the best achievable WPE rises from around 18% at 10 mW optical power [2] to 35% at 250 mW optical power [3]. To leverage this inherent property of the laser operation for optimal energy allotment, we assume an array of high power DFB lasers at each WDM wavelength which are multiplexed together and then split into various ports where each wavelength in each port has just enough power to overcome the link losses [8]. However, splitting 250 mW of optical power into streams of -5 dBm quickly becomes impractical, and thus we assume that in the DFB case the ideal laser operating condition is at 10 mW with 18% WPE. For simplicity in the analysis, we treat each wavelength in each port after splitting as its own independent source (“effective laser”). Given the experimentally measured optical link losses of 14.5 dB for the amplifier-free chip-to-chip link demonstrated in the main text and receiver sensitivity of -19.5 dBm, this yields an optical power requirement of -5 dBm for each wavelength with the demonstrated edge couplers and -10.7 dBm with improved edge couplers. Assuming a WPE of 18% at -5 dBm (0.316 mW) optical power after splitting, each “effective laser” consumes 1.76 mW of electrical power. At a data rate of 10 Gb/s per “effective laser”, this yields a total energy consumption of 175 fJ/bit. For the case of improved edge couplers, this energy consumption drops to 47 fJ/bit. Since the WPE of DFBs can vary significantly between designs, the energy consumption per bit is provided as a function of WPE in Fig. 2. However, through moving to integrated Kerr frequency combs which rely on higher pump power for seamless conversion into new frequency channels, the higher WPE regime of high-power DFB lasers can be accessed without excessive splitting penalties.

**FIG. 1. Suspended edge couplers.** **a**, Cross-sectional views and simulated mode profiles for SMF-28 fiber, silicon nitride (SiN) edge coupler without substrate removal, and SiN edge coupler with selective substrate removal. Through removing the substrate, it is clear that the symmetry of the edge coupler modal profile more closely overlaps with that of SMF-28 fiber. **b**, Annotated microscope image of a fabricated suspended edge coupler with the removed substrate clearly visible. **c**, Experimentally measured test device results for various edge coupler tip widths as a function of taper length showing an optimal coupling loss of 1.1 dB/facet.

**FIG. 2. DFB array energy consumption.** Energy consumption per bit for a DFB laser array as a function of laser WPE. At 18% WPE [2], the array consumes 175 fJ/bit with the demonstrated edge couplers and 47 fJ/bit with improved edge couplers.

**FIG. 3. Kerr comb energy consumption.** Energy consumption per bit for a Kerr comb source pumped by a DFB as a function of pump DFB WPE. At 35% DFB WPE [3] and 40% pump-to-comb conversion efficiency [4], the comb consumes 225 fJ/bit with the demonstrated edge couplers and 60 fJ/bit with improved edge couplers.

**FIG. 4. Effect of Kerr comb conversion efficiency (CE) on energy consumption.** Energy consumption per bit for a Kerr comb source pumped by a DFB as a function of pump DFB WPE for various pump-to-comb conversion efficiencies. At 35% DFB WPE [3] and 80% pump-to-comb conversion efficiency [16], the comb consumes 113 fJ/bit with the demonstrated edge couplers and 30 fJ/bit with improved edge couplers. All curves are plotted assuming improved edge couplers.
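The DFB energy-per-bit figures follow from the required optical power, the WPE, and the data rate. A minimal sketch of that arithmetic is given below, using the values quoted in this section; the function names are illustrative.

```python
def dbm_to_mw(power_dbm):
    """Convert optical power from dBm to milliwatts."""
    return 10 ** (power_dbm / 10)


def dfb_energy_per_bit_fj(optical_dbm, wpe, data_rate_gbps=10.0):
    """Electrical energy per bit (fJ) for one 'effective laser'."""
    electrical_mw = dbm_to_mw(optical_dbm) / wpe
    # 1 mW / (1 Gb/s) = 1e-12 J/bit = 1000 fJ/bit
    return electrical_mw / data_rate_gbps * 1000.0


energy_demonstrated_fj = dfb_energy_per_bit_fj(-5.0, wpe=0.18)
energy_improved_fj = dfb_energy_per_bit_fj(-10.7, wpe=0.18)
```

These evaluate to roughly 176 fJ/bit and 47 fJ/bit, in line with the 175 fJ/bit and 47 fJ/bit quoted above. Sweeping `wpe` traces out the kind of curve described for Fig. 2.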

#### Integrated Comb Source Energy Consumption

While DFB arrays are a mature solution for WDM systems, integrated comb sources are also highly appealing due to their ability to provide many wavelength channels from a single device [9–12]. Recent demonstrations have shown that integrated Kerr frequency combs can be coherently combined to boost the power-per-line while also permitting spectral shaping to flatten the comb spectrum [13, 14], making their deployment as WDM sources in future photonic interconnects a realistic prospect. In particular, nonsolitonic Kerr frequency combs in the normal group velocity dispersion (GVD) regime exhibit much higher pump-to-comb conversion efficiencies compared to soliton Kerr combs in the anomalous GVD regime [4, 15]. While standard coupled-resonator designs for nonsolitonic Kerr combs can achieve conversion efficiencies (CE) as high as 41% [4], recent demonstrations of advanced designs have shown pump-to-comb CE approaching unity (86%) [16]. The comb source energy consumption for both edge coupler cases is shown in Fig. 3 as a function of pump laser WPE. Fig. 4 shows the energy per bit as a function of pump DFB WPE for various pump-to-comb CEs. Using a high power DFB pump with 35% WPE and a pump-to-comb CE of 40%, the energy consumption of the comb source is 60 fJ/bit (assuming improved edge couplers). However, using an improved CE of 80% as demonstrated in ref. [16], this energy consumption improves by a factor of 2 to 30 fJ/bit.
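The comb figures quoted above are consistent with dividing the required optical power per line by the product of pump WPE and pump-to-comb CE. The sketch below makes that assumption explicit. It treats every comb line as carrying exactly the required power with none wasted, which is a simplification introduced here and not a statement from the text.

```python
def comb_energy_per_bit_fj(optical_dbm, pump_wpe, conversion_efficiency,
                           data_rate_gbps=10.0):
    """Electrical energy per bit (fJ) attributed to one comb line.

    Assumes per-line wall-plug power = line power / (pump WPE * CE).
    """
    line_mw = 10 ** (optical_dbm / 10)
    electrical_mw = line_mw / (pump_wpe * conversion_efficiency)
    # 1 mW / (1 Gb/s) = 1000 fJ/bit
    return electrical_mw / data_rate_gbps * 1000.0


# Demonstrated (-5 dBm) and improved (-10.7 dBm) edge couplers, 35% pump WPE.
comb_40_demonstrated = comb_energy_per_bit_fj(-5.0, 0.35, 0.40)
comb_40_improved = comb_energy_per_bit_fj(-10.7, 0.35, 0.40)
comb_80_demonstrated = comb_energy_per_bit_fj(-5.0, 0.35, 0.80)
comb_80_improved = comb_energy_per_bit_fj(-10.7, 0.35, 0.80)
```

Under this assumption the four cases come out near 226, 61, 113, and 30 fJ/bit, close to the values reported for Figs. 3 and 4.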



**FIG. 5. Microdisk modulator resonance variation.** **a**, Measured spectrum for resonator buses across 62 reticles on a 300 mm wafer. Each bus contains four resonators, each targeted at a different nominal wavelength ( $\lambda_0$ ,  $\lambda_1$ ,  $\lambda_2$ , and  $\lambda_3$ ). **b**, Histogram of resonance wavelength for each nominal device across all 62 reticles, with the standard deviation labeled for each target wavelength. The mean of the standard deviation across all devices is  $\sigma_{avg} = 0.69$  nm.

### SUPPLEMENTARY NOTE 2: THERMAL TUNING ENERGY CONSUMPTION

Since the laser wavelengths are assumed to be fixed under realistic field conditions in a deployed system, the resonators must be tuned to align with the laser wavelength grid. The two contributions to the total required resonator tuning are deviations due to fabrication variations (static) and deviations due to temperature fluctuations (dynamic). The magnitudes of these variations are shown in Fig. 6 using realistic values for process variations and thermal swings. The simulations assume a thermo-optic coefficient of  $1.8 \times 10^{-4}$  K<sup>-1</sup> for intrinsic silicon near room temperature for  $\lambda = 1,550$  nm [17].



FIG. 6. **Temperature sensitivity of microdisk modulators.** Simulated temperature dependence for a microdisk modulator with a free spectral range (FSR) of approximately 24 nm near  $\lambda = 1,550$  nm. The slope of the linear fit is 108 pm/K.

The resonator free spectral range (FSR) was measured to be approximately 24 nm near a wavelength of 1,550 nm, and a full FSR corresponds to a  $2\pi$  phase shift. Therefore, the slope of the curve shown in Fig. 6 is equivalent to 0.03 radians/K. While the electrical power required to yield a  $\pi$  phase shift ( $P_\pi$ ) is approximately 30 mW for standard microdisk modulator integrated heater designs, we have recently demonstrated improved designs with a selective substrate undercut which exhibit  $P_\pi = 6.9$  mW [18].

While the thermal fluctuations of the photonic chip result in resonator drift, the integrated heaters in each resonator are additionally used to trim the resonance wavelength to compensate for fabrication variations. Using experimentally measured wafer-scale data for resonators across a full 300 mm wafer, we quantify the magnitude of the resonance variations to be  $\sigma = 690$  pm per device (Fig. 5). We use this value together with the experimentally measured 6.9 mW  $P_\pi$  to quantify the energy consumed by each resonator due to fabrication variations. The device tuning efficiency is 0.45 radians/mW, or 1.74 nm/mW. Thus, to correct for a one  $\sigma$  resonance variation of 690 pm, we require 0.4 mW per resonator. At a data rate of 10 Gb/s, this corresponds to 40 fJ/bit energy consumption required to correct for fabrication variations. If we consider  $3\sigma$  variations, the energy consumption rises to 120 fJ/bit.
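The tuning-efficiency and trimming-energy numbers above can be checked with the following sketch. It uses only values quoted in this note and assumes, as the text does implicitly, that heater power scales linearly with the resonance shift; the variable and function names are illustrative.

```python
import math

FSR_NM = 24.0          # free spectral range; a full FSR is a 2*pi phase shift
P_PI_MW = 6.9          # heater power for a pi phase shift
SIGMA_NM = 0.69        # one-sigma fabrication variation
SLOPE_NM_PER_K = 0.108 # temperature sensitivity from Fig. 6
DATA_RATE_GBPS = 10.0

rad_per_kelvin = SLOPE_NM_PER_K / FSR_NM * 2 * math.pi  # ~0.03 rad/K
rad_per_mw = math.pi / P_PI_MW                          # ~0.45 rad/mW
nm_per_mw = (FSR_NM / 2) / P_PI_MW                      # ~1.74 nm/mW


def trim_energy_fj(shift_nm):
    """Heater energy per bit (fJ) to hold a static resonance shift."""
    power_mw = shift_nm / nm_per_mw
    # 1 mW / (1 Gb/s) = 1000 fJ/bit
    return power_mw / DATA_RATE_GBPS * 1000.0


one_sigma_fj = trim_energy_fj(SIGMA_NM)
three_sigma_fj = trim_energy_fj(3 * SIGMA_NM)
```

The results are about 40 fJ/bit for one $\sigma$ and about 119 fJ/bit for $3\sigma$, consistent with the rounded values in the text.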

In the energy consumption analysis, we consider corners based on ‘best case’ and ‘worst case’ values for both the fabrication variations and thermal fluctuations. We assume that under a normal, quasi-constant load, the thermal environment for the photonic devices will fluctuate on the order of  $\Delta T = 10$  K. If we assume the temperature offset from baseline at each time slice follows a normal distribution defined over the range  $\Delta T = 0$  K to  $\Delta T = 10$  K, the mean temperature offset is  $\Delta T = 5$  K. Under more dynamic thermal loads, we assume that the photonic devices can fluctuate up to 50 K; following the same logic, the average offset is  $\Delta T = 25$  K. For the ‘best case’ fabrication variations, we assume one  $\sigma$  values, whereas for the ‘worst case’ we assume  $3\sigma$  values. These corners are summarized in the plot shown in Fig. 7, with the ‘best case’ energy consumption calculated at 71 fJ/bit and ‘worst case’ energy consumption calculated at 274 fJ/bit.

FIG. 7. **Power and energy consumption of microdisk modulators.** **a**, Calculated power of each resonant device in milliwatts as a function of resonance variation due to fabrication and temperature variation. **b**, Calculated energy consumption per bit showing the ‘best case’ value of 71 fJ/bit and ‘worst case’ value of 274 fJ/bit.
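The two corner values can be reproduced by adding the fabrication and thermal wavelength shifts and converting the total to heater power. The sketch below assumes the two shifts add linearly and that heater power is linear in the shift. These assumptions are not stated explicitly in the text, but they recover the quoted corners.

```python
NM_PER_MW = 12.0 / 6.9   # tuning efficiency: half of the 24 nm FSR per 6.9 mW
SLOPE_NM_PER_K = 0.108   # temperature sensitivity from Fig. 6
SIGMA_NM = 0.69          # one-sigma fabrication variation


def tuning_energy_fj(n_sigma, mean_delta_t_k, data_rate_gbps=10.0):
    """Heater energy per bit (fJ) for a given fabrication and thermal corner."""
    shift_nm = n_sigma * SIGMA_NM + mean_delta_t_k * SLOPE_NM_PER_K
    power_mw = shift_nm / NM_PER_MW
    # 1 mW / (1 Gb/s) = 1000 fJ/bit
    return power_mw / data_rate_gbps * 1000.0


best_case_fj = tuning_energy_fj(n_sigma=1, mean_delta_t_k=5)
worst_case_fj = tuning_energy_fj(n_sigma=3, mean_delta_t_k=25)
```

These evaluate to about 71 fJ/bit and 274 fJ/bit, matching the ‘best case’ and ‘worst case’ values.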

**SUPPLEMENTARY NOTE 3: TRANSIMPEDANCE AMPLIFIER ENERGY CONSUMPTION**

We explore the trade-offs between noise, bandwidth, and power of the receiver TIA by starting with the input-referred thermal noise current per Hz of the TIA,

$$\frac{\overline{i_{n,in}^2}}{Hz} = \frac{4kT}{R_f} + \frac{4kT}{2g_m}\gamma C^2(2\pi f)^2 \quad (1)$$

where  $k$  is the Boltzmann constant,  $T$  is temperature,  $\gamma$  is the excess noise factor of the transistors,  $f$  is frequency,  $R_f$  is the TIA feedback resistance,  $g_m$  is the transistor transconductance, and  $C$  is the TIA input capacitance. The second term dominates the noise; integrating this term over frequency from zero to the channel bandwidth ( $f = BW$ ) and removing constants,

$$\overline{i_{n,in}^2} \sim \frac{1}{g_m} C^2 BW^3 \quad (2)$$

where  $\overline{i_{n,in}^2}$  is the total input-referred noise current. The signal-to-noise ratio,  $SNR$ , is the squared input signal current,  $I^2$ , divided by the total input noise,

$$SNR \sim I^2 \frac{g_m}{C^2 BW^3}. \quad (3)$$
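For reference, the integration behind Eq. (2) can be written out explicitly. This is a sketch that keeps only the second term of Eq. (1) and treats $g_m$, $C$, and $\gamma$ as independent of frequency:

$$\int_0^{BW} \frac{4kT}{2g_m}\gamma C^2(2\pi f)^2 \, df = \frac{8\pi^2 kT\gamma}{3}\,\frac{C^2 BW^3}{g_m}$$

Dropping the constant prefactor $8\pi^2 kT\gamma/3$ leaves the $C^2 BW^3 / g_m$ scaling that appears in Eqs. (2) and (3).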

Since $g_m$ is proportional to the receiver static bias power and $BW$ is proportional to the bit rate,

$$SNR \sim \left(\frac{I}{BW}\right)^2 \frac{E/bit_{RX}}{C^2}. \quad (4)$$
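The step from Eq. (3) to Eq. (4) can be spelled out as follows. The symbol $P_{RX}$ for the receiver static bias power is introduced here only for illustration, and the sketch assumes that $P_{RX}$ scales linearly with $g_m$ and that the bit rate scales linearly with $BW$:

$$E/bit_{RX} \sim \frac{P_{RX}}{BW} \sim \frac{g_m}{BW} \quad \Rightarrow \quad g_m \sim E/bit_{RX} \cdot BW$$

$$SNR \sim I^2 \, \frac{E/bit_{RX} \cdot BW}{C^2 BW^3} = \left(\frac{I}{BW}\right)^2 \frac{E/bit_{RX}}{C^2}$$

Read this way, at a fixed $SNR$ and input signal current, the receiver energy per bit scales as $C^2 BW^2$.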

- [1] Almeida, V. R., Panepucci, R. R. & Lipson, M. Nanotaper for compact mode conversion. *Optics letters* **28**, 1302–1304 (2003).
- [2] Koch, B. R. *et al.* Integrated silicon photonic laser sources for telecom and datacom. In *Optical Fiber Communication Conference*, PDP5C-8 (Optica Publishing Group, 2013).
- [3] Morrison, G. B. *et al.* High power single mode photonic integration. In *2019 IEEE High Power Diode Lasers and Systems Conference (HPD)*, 47–48 (IEEE, 2019).
- [4] Kim, B. Y. *et al.* Turn-key, high-efficiency kerr comb source. *Optics letters* **44**, 4475–4478 (2019).
- [5] Wade, M. *et al.* An error-free 1 tbps wdm optical i/o chiplet and multi-wavelength multi-port laser. In *Optical Fiber Communication Conference*, F3C-6 (Optica Publishing Group, 2021).
- [6] Kumar, R. *et al.* Demonstration of a hybrid iii-v/si multi-wavelength dfb laser for high-bandwidth density i/o applications. In *Optical Fiber Communication Conference*, Tu2E-5 (Optica Publishing Group, 2022).
- [7] Li, J. *et al.* An eight-wavelength bh dfb laser array with equivalent phase shifts for wdm systems. *IEEE Photonics Technology Letters* **26**, 1593–1596 (2014).
- [8] Buckley, B. B. *et al.* Wdm source based on high-power, efficient 1280-nm dfb lasers for terabit interconnect technologies. *IEEE Photonics Technology Letters* **30**, 1929–1932 (2018).
- [9] Gaeta, A. L., Lipson, M. & Kippenberg, T. J. Photonic-chip-based frequency combs. *nature photonics* **13**, 158–169 (2019).
- [10] Chang, L., Liu, S. & Bowers, J. E. Integrated optical frequency comb technologies. *Nature Photonics* **16**, 95–108 (2022).
- [11] Rizzo, A. *et al.* Massively scalable kerr comb-driven silicon photonic link. *Nature Photonics* 1–10 (2023).
- [12] Rizzo, A. *et al.* Petabit-scale silicon photonic interconnects with integrated kerr frequency combs. *IEEE Journal of Selected Topics in Quantum Electronics* **29**, 1–20 (2022).
- [13] Kim, B. Y. *et al.* Coherent combining for high-power kerr combs. *Laser & Photonics Reviews* 2200607 (2023).
- [14] Kim, B. Y. *et al.* Synchronization of nonsolitonic kerr combs. *Science Advances* **7**, eabi4362 (2021).
- [15] Jang, J. K. *et al.* Conversion efficiency of soliton kerr combs. *Optics Letters* **46**, 3657–3660 (2021).
- [16] Zang, J., Yu, S.-P., Carlson, D. R., Briles, T. C. & Papp, S. B. Near unit efficiency in microresonator combs. In *CLEO: Science and Innovations*, STh4F-3 (Optica Publishing Group, 2022).
- [17] Komma, J., Schwarz, C., Hofmann, G., Heinert, D. & Nawrodt, R. Thermo-optic coefficient of silicon at 1550 nm and cryogenic temperatures. *Applied Physics Letters* **101** (2012).
- [18] Rizzo, A. *et al.* Ultra-efficient foundry-fabricated resonant modulators with thermal undercut. In *CLEO: Science and Innovations*, SF2K-6 (Optica Publishing Group, 2023).