

> REPLACE THIS LINE WITH YOUR MANUSCRIPT ID NUMBER (DOUBLE-CLICK HERE TO EDIT) <

# Heterogeneous Integration in Co-Packaged Optics

Yu-Tao Yang, and Chih-Ming Hung, *Fellow, IEEE*

**Abstract**—Generative artificial intelligence (GAI) and large language models (LLMs) require data centers to deliver higher bandwidth and better energy efficiency. Co-packaged optics (CPO), which combines advanced packaging with integrated photonics, is one promising direction for meeting this demand. However, this tight integration complicates data center system design and introduces multi-physics interactions spanning electrical, optical, thermal, mechanical, and material aspects. This paper discusses heterogeneous integration (HI) in CPO. Multi-physics packaging is exemplified with two cases. Challenges in HI technologies are reviewed and corresponding mitigation methods are provided, including (1) thermal crosstalk within the electrical domain and between the electrical and optical domains, (2) signal integrity and power integrity (SIPI) of wide-and-slow and narrow-and-fast channel links, and (3) pros and cons of interposer materials. The integrated photonics part covers (1) light sources, (2) optical coupling strategies, (3) fiber attach schemes with advanced packaging, and (4) integrated optical technologies, e.g., novel microlenses, optical TSVs, 3D waveguides, and optical 3DIC. This article aims to identify the key HI challenges in CPO and to point out potential solutions for future CPO system advancement.

**Index Terms**—Co-packaged optics, Data center, Silicon photonics, Multi-physics heterogeneous integration, Wireline communication

## I. INTRODUCTION

AI and LLMs accelerate the demand for high-performance, high-bandwidth computing because the number of neural network parameters is growing explosively. In 2013, AlexNet, a convolutional neural network (CNN) architecture with 60 million parameters, was among the first models to reach one Exa-FLOP of training compute. In 2020, GPT-3, an autoregressive transformer model with 175 billion parameters, required one million Exa-FLOPs of computation. It was followed by GPT-4 with 1.8 trillion parameters and by GPT-5 with an estimated 3 to 5 trillion parameters, as shown in Fig. 1(a).

Computation at this scale inevitably generates data at the same scale, up to 90,000 petabits by 2027, as shown in Fig. 1(b). Energy and power consumption rise accordingly. In Fig. 1(c), for example, the computing power for GAI alone may increase by much more than 200% in three years, whereas global energy generation ramps up by merely 6% over the same period. This indicates that computing power demand may soon exceed global energy generation before new computing architectures are massively deployed in data centers.
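The gap between the two growth rates can be made concrete with a short compound-growth sketch. The starting share of global generation consumed by computing (`initial_share`) is a hypothetical input and not a figure from this paper, and a 200% increase is read here as a 3x multiplier per three-year period.

```python
import math

def years_to_crossover(initial_share: float,
                       demand_factor: float = 3.0,
                       supply_factor: float = 1.06,
                       period_years: float = 3.0) -> float:
    """Years until demand equals supply under constant compound growth.

    initial_share : demand / supply today (hypothetical input, 0 < share <= 1)
    demand_factor : demand multiplier per period (3.0 means +200%)
    supply_factor : supply multiplier per period (1.06 means +6%)
    """
    if not 0.0 < initial_share <= 1.0:
        raise ValueError("initial_share must be in (0, 1]")
    if demand_factor <= supply_factor:
        raise ValueError("demand must grow faster than supply to cross over")
    # Solve initial_share * (demand_factor / supply_factor) ** n = 1 for n periods.
    periods = math.log(1.0 / initial_share) / math.log(demand_factor / supply_factor)
    return periods * period_years

# Hypothetical starting shares, for illustration only.
for share in (0.01, 0.02, 0.05):
    print(f"share today {share:.0%}: crossover in about {years_to_crossover(share):.1f} years")
```

Because the growth is exponential, the crossover time depends only logarithmically on the starting share, so even a small share today does not push the crossover far into the future.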

Around 25% to 50% of this increased energy is consumed by wired electrical and optical data communication, even though transistor scaling is moving toward an energy efficiency of 1 aJ/b. Future wireline scaling therefore targets higher performance and higher data bandwidth at lower power consumption per bit. One promising solution is complex and compact HI that co-designs the electrical, optical, thermal, mechanical, and material sub-parts.
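The link between energy per bit and total communication power is a simple product, sketched below. The aggregate bandwidth and the pJ/b operating points are hypothetical values chosen for illustration and are not taken from this paper.

```python
def link_power_watts(bandwidth_tbps: float, energy_pj_per_bit: float) -> float:
    """Power (W) = bits per second x joules per bit."""
    bits_per_second = bandwidth_tbps * 1e12
    joules_per_bit = energy_pj_per_bit * 1e-12
    return bits_per_second * joules_per_bit

# Hypothetical operating points, for illustration only.
AGGREGATE_TBPS = 51.2
for label, pj_per_bit in (("long-reach electrical", 15.0), ("co-packaged optical", 5.0)):
    power = link_power_watts(AGGREGATE_TBPS, pj_per_bit)
    print(f"{label}: {power:.0f} W at {AGGREGATE_TBPS} Tb/s")
```

Since 1 Tb/s at 1 pJ/b equals 1 W, the product of Tb/s and pJ/b gives watts directly, which makes the per-bit target easy to translate into a package power budget.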

Fig. 2 shows a conceptual example of ultimately compact integration of various components, including three-dimensional integrated circuits (3DIC), 3D memory, interposer/substrate, through-silicon vias (TSVs), micro-fluidic cooling channels, optical sources, waveguides, optical transceivers, fiber attachment, etc. Here, 3DIC is a collective term that can refer to any main die, e.g., application-specific integrated circuits (ASICs), switch dies, or XPUs. The 3D memory can be, for example, high bandwidth memory (HBM). This level of integration aims to minimize resistive copper (Cu) electrical communication among the compute, memory, and optical parts.

The challenge, however, is to co-optimize the multi-physics interactions. Consider first the system design interaction between the electrical and mechanical domains. When multiple chipsets are integrated on one substrate, the substrate must be large enough, with margin, to accommodate the total chip area. Such a large substrate naturally exhibits large warpage because of the material mismatch among the embedded core and build-up layers. Warpage affects the exposure depth of focus of the stepper during lithography, which limits the minimum resolvable line and space (L/S) of Cu wires and traces. Cu L/S in turn constrains the electrical routing of power, clock, and signal lines and the isolation between chipsets. This requires significant optimization and iteration in electrical place and route, fan-out design, power integrity (PI), signal integrity (SI), loss compensation, etc. Warpage in a large substrate also limits the reliable minimum bump pitch, bump dimension, and die-to-die spacing across the center and edge areas because of mechanical and reliability challenges. Die spacing, I/O pitch, and Cu L/S constrain fan-out routability and bump-to-bump distance for all I/Os, affecting reach, bandwidth, bandwidth density, and energy efficiency in various die-to-die communication protocols.

Yu-Tao Yang is with MediaTek USA Inc., San Jose, CA 95134 USA (yutaoyang@mediatek.com).

Chih-Ming Hung is with MediaTek Inc, Hsinchu Science Park, Hsinchu City 300, Taiwan (cmhung@mediatek.com).




**Fig. 1. (a)** Growth of AI training and corresponding training compute required in FLOPS, GPT-5 is estimated based on early release [1-3] **(b)** Bandwidth shipping by market [1-3] **(c)** Total computing power and energy consumption per year [1-3]

A second example, based on the system drawing in Fig. 2, spans the electrical, optical, thermal, and mechanical domains. After place-and-route optimization of the electrical chips, electrical signals can travel vertically to the optical transceivers by way of through vias (TVs). This drives the place-and-route optimization of the optical chips and waveguides, the selection between integrated and standalone laser sources, and the choice of optical connectivity and testing through edge coupling or grating coupling. A TV in an interposer can serve as both an electrical TV and a thermal TV, and its height is tied to the interposer/substrate thickness, which leads to a tradeoff among signal and power integrity (SIPI) in the TV, warpage, and place and route under keep-out zone constraints. When electrical and optical chipsets operate simultaneously, thermal hotspots are created, giving rise to thermal crosstalk, both (1) electrical-to-electrical and (2) electrical-to-optical. Electrical chipsets must be kept below their junction temperature limits while handling a record-high thermal density. Optical chipsets face, in addition to junction temperature limits, requirements on thermal stability and thermally induced reliability challenges in the component-level lasers, both of which require delicate system design optimization.



**Fig. 2** Conceptual structure of one 3-D heterogeneous electrical and optical integrated system ©2011 IEEE [4]. The Vertical Cavity Surface Emitting Laser (VCSEL) in this figure is to illustrate optical source integration in a future system. For CPO applications, semiconductor lasers (e.g., DFB lasers and comb lasers) are commonly adopted.




**Fig. 3.** CPO roadmap with tighter integration of EIC, PIC, and main dies. © 2021 [5]. This figure is openly licensed via CC BY-NC-ND 4.0

Among real-world HI platforms, CPO resembles the integration scheme in Fig. 2, with the light source replaced by semiconductor lasers. General optical connectivity consists of four major parts: (1) electrical integrated circuit (EIC) dies, including the optical driver, transimpedance amplifier (TIA), and retimer; (2) photonic integrated circuits (PICs), including the modulator, photodetector, and optional laser source; (3) advanced packaging and integration; and (4) fiber attach and optical coupling [6]. As shown in Fig. 3, CPO brings the front-panel pluggable EIC and PIC closer to the main die and co-packages the main die, EIC, and PIC on one substrate. Co-packaged schemes include side-by-side 2.5D CPO, chiplet-based 2.5D CPO, and, in the future, stacked 3D CPO on one interposer, all of which are driven by the demand for high bandwidth. This significantly shortens the lossy Cu traces for high-speed serializer/deserializer (SerDes) applications [7-11], so shorter-reach SerDes can be used for communication between the optical transceiver dies and the main dies. With much less on-chip equalization and without retimer chips, the power consumption of wired electrical communication can be reduced, and the electrical beachfront bandwidth density can gradually match its optical counterpart. CPO has been demonstrated by various companies, including MediaTek [12], Broadcom [13], Intel [14], Cisco [15], and Marvell [16].

In this paper, advanced integration and optical coupling in CPO are the focus, as shown in Fig. 4. The challenges are introduced in section II, including electrical, optical, thermal, mechanical, and material.



**Fig. 4.** Anatomy of co-packaged optics system. The integration part is the focus in this paper, including large package, interposer, lossy channel, and fiber attach [17]

## II. HI CHALLENGES IN CPO

In this section, the HI challenges are discussed in the following order: A. thermal, B. SIPI of lossy channels, C. interposer/substrate material, D. light source, E. optical coupling strategy, F. fiber attach, and G. integrated optical technologies.

### A. Thermal



**Fig. 5** Predicted power consumption based on logic and memory switching and leakage power ©2011 IEEE [18]

In a data center, a PCB system in a rack box is composed of several main dies (ASIC, XPU), HBM, and optics. With future higher-bandwidth chipsets and stacking applications, the power consumption per socket may be much higher than 2 kW based on Fig. 5 and Fig. 6, although CPO can relax this at the same bandwidth. Because of the compact integration of high-power main dies, HBM, and optics, this power level and power density remain challenging to cool and can cause reliability issues and a high failure-in-time rate if not properly addressed. The paragraphs below describe how thermal effects impact a CPO system from the electrical and optical perspectives and discuss potential thermal solutions.


| Parameter                                | Current (2023)     | Proposed (2035)   |
|------------------------------------------|--------------------|-------------------|
| Architecture                             | Discrete GPU + CPU | xPU (CPU+GPU)     |
| FP64 Flops                               | $10^8$             | $10^{21}$         |
| Socket Size (mm<sup>2</sup>)             | ~1,000             | ~200,000          |
| Silicon Area (mm<sup>2</sup>) / Socket   | 4,000              | 500,000-1,000,000 |
| Transistors (Billions) / Socket          | 100                | 10,000            |
| Nodes                                    | 10,000             | 100               |
| Socket power (kW)                        | 2                  | 100               |
| Exaflops/MW(HPC - FP64)                  | 15                 | 1500              |
| Total System Power (MW)                  | 50-60              | 100-150           |

Fig. 6 Potential system attributes for the next era ©2023 IEEE [19]

For the electrical part, besides keeping the chipsets below the device junction temperature, the new challenges are a record-high thermal design power (TDP) and the corresponding thermal crosstalk. For example, HBM stacks multiple layers of dynamic random access memory (DRAM), and DRAM is sensitive to transistor and capacitor leakage currents, both of which worsen significantly at high temperature and force more frequent memory refresh operations. Because it relies on a short-reach proprietary communication IP, HBM is inevitably placed around the main dies at a distance of a few mm, which makes HBM the thermal victim of the high-power main dies. This thermal crosstalk stems from lateral thermal transport through the interposer, epoxy molding compound (EMC), and heat sink. Proper thermal isolation is therefore needed, for example, reduced in-plane thermal conductivity of the interposer and a co-designed thermal isolation structure in the heat sink. Besides side-by-side integration, there is a growing trend of stacking memory directly on top of a main die, either as bare memory dies or as packaged chips. This gives higher bandwidth between the memory and the compute core but requires tighter thermal specifications.



Fig. 7 Different types of TIM with corresponding thermal conductivity and target power usage ©2023 IEEE [21]

For optical chipsets, besides device junction temperature and thermal density in lasers, the new challenge is thermal uniformity and stability in the vicinity of high-power main dies. In a PIC, phase and wavelength are key optical design parameters in waveguides, resonators, and modulators, and they are sensitive to changes in refractive index ($\text{dn}/\text{dT} \sim 10^{-4}\,\text{K}^{-1}$). Depending on the multiplexing scheme and the type of modulator, dynamic thermal crosstalk from nearby main dies may cause instantaneous temperature variations of a few degrees across the modulation area. This can lead to multiplexing instability in a system, which requires effective and timely thermal control to counter aperiodic heat waves.
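To see why a temperature swing of a few degrees matters, the sketch below applies the common first-order estimate for the thermal shift of a resonance, $\Delta\lambda \approx \lambda \,(\text{dn}/\text{dT})\,\Delta T / n_g$. The carrier wavelength (1310 nm) and the group index $n_g = 4$ are assumed values for illustration; only the order of magnitude of dn/dT comes from the text.

```python
def resonance_shift_nm(wavelength_nm: float,
                       dn_dT_per_K: float,
                       delta_T_K: float,
                       group_index: float) -> float:
    """First-order thermal shift of a resonance: d_lambda = lambda * (dn/dT) * dT / n_g."""
    return wavelength_nm * dn_dT_per_K * delta_T_K / group_index

DN_DT = 1e-4            # 1/K, order of magnitude quoted in the text
WAVELENGTH_NM = 1310.0  # assumed O-band carrier (not from the paper)
GROUP_INDEX = 4.0       # assumed group index of the waveguide (not from the paper)

for delta_T in (1.0, 3.0, 5.0):
    shift = resonance_shift_nm(WAVELENGTH_NM, DN_DT, delta_T, GROUP_INDEX)
    print(f"dT = {delta_T:.0f} K -> resonance shift of about {shift * 1000:.0f} pm")
```

Under these assumptions a few kelvin moves a resonance by tens of picometers or more, which can be a noticeable fraction of the linewidth of a high-Q resonant modulator or filter and is why active thermal control is usually considered.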



Fig. 8 Transient loads using an objective function penalization method ©2025 IEEE [23]

Considering the thermal requirements of record-high TDP, thermal crosstalk, and thermal stability, novel cooling strategies will be needed. Liquid-based cooling technologies may be required for heat flux $>1\,\text{W/mm}^2$ [20]. These technologies fall into two major categories: (1) devices in direct contact with the fluid, e.g., immersion cooling and electrowetting; and (2) devices in indirect contact with the fluid, with a thermal interface material (TIM) in between, e.g., two-phase heat pipe, jet impingement boiling, thermoelectric cooling, vacuum chamber, and phase change material with a pin fin heat sink. The TIM is an intermediate layer that fills the uneven interface between the main chipsets (ASIC, XPU, HBM) and the indirect cooling hardware. It facilitates heat conduction and reduces the thermal resistance along the path. The future direction for TIM development is to keep reducing thermal resistance, either through a higher thermal conductivity, $k$, or through a thinner bond-line thickness of less than 100 um. As shown in Fig. 7, TIMs can be grouped into three types: (1) polymer-based TIM, including grease, adhesive, etc., with $k$ up to 15 W/mK; (2) metal- and solder-based TIM, including alloys, metallic foils, etc., with $k$ up to 80 W/mK; and (3) carbon-based TIM, including graphite with $k$ up to 40 W/mK [21], carbon nanotubes with $k$ up to 50 W/mK [22], and diamond with $k$ much higher than 100 W/mK. Besides cooling technologies and material innovation, recent research combines liquid-based microchannel heat sinks with thermal-aware floor planning and topology optimization for continuous and transient main die workloads, which is one direction for future thermal design technology co-optimization, as shown in Fig. 8 [23].
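The effect of TIM conductivity and thickness can be estimated with one-dimensional conduction, $\Delta T = q'' \, t / k$. The sketch below uses the heat flux threshold (1 W/mm²), the 100 um thickness, and the upper-bound $k$ values quoted above; it ignores contact resistance and heat spreading, so it is a lower bound on the real temperature drop rather than a design calculation.

```python
def tim_temperature_drop_K(heat_flux_W_per_mm2: float,
                           thickness_um: float,
                           k_W_per_mK: float) -> float:
    """1-D conduction drop across a TIM layer: dT = q'' * t / k (contact resistance ignored)."""
    heat_flux_W_per_m2 = heat_flux_W_per_mm2 * 1e6
    thickness_m = thickness_um * 1e-6
    return heat_flux_W_per_m2 * thickness_m / k_W_per_mK

# Upper-bound conductivities quoted in the text, in W/mK.
TIM_K = {"polymer": 15.0, "metal/solder": 80.0, "graphite": 40.0, "carbon nanotube": 50.0}

for name, k in TIM_K.items():
    drop = tim_temperature_drop_K(heat_flux_W_per_mm2=1.0, thickness_um=100.0, k_W_per_mK=k)
    print(f"{name:>16}: {drop:.2f} K across 100 um at 1 W/mm^2")
```

The same function shows the second lever named in the text: halving the layer thickness halves the drop for any given $k$.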

### B. SIPI of lossy channels

SIPI refers to signal integrity and power integrity, i.e., the control of signal distortion, signal dispersion, and power distribution quality in electronic systems. Its effect on the performance of a CPO system is generally characterized through the channel eye diagram, bit error rate, etc. Good SIPI is required for reliable and stable operation and communication among main dies, HBM, EIC, and PIC on an interposer and a package substrate. Detailed SIPI analysis is beyond the scope of this paper; instead, two communication examples are illustrated in the paragraphs below: (a) between main dies and HBM, and (b) between main dies and EIC.



Fig. 9 Example of voltage overshoot based on varying channel dimension and eye diagram of (b) & (c) ©2024 IEEE [24]

Communication between main dies and HBM goes through a proprietary HBM IP that drives a bump-to-bump distance of 5 mm, over a frequency range that, with margin, potentially extends to 10 to 15 GHz. SI metrics need to consider insertion/return loss, crosstalk, eye diagram (e.g., eye width and height, rising edge, falling edge, overshoot), timing skew, propagation delay, etc., as shown in Fig. 9. These metrics are affected by RC, layout, line and space, interposer, and package stack-up (e.g., layers, material dielectric constant, loss tangent, and thickness [25, 26]). As shown in Fig. 10(a), PI is affected by resistance, inductance, and time-varying current. These contributions add up along the path, which includes the microbump, through-silicon via (TSV), C4 bump, redistribution layer, plated through hole (PTH), and BGA ball, as shown in Fig. 10(b). PI metrics include IR drop, ripple, power droop, and power delivery network (PDN) impedance, as shown in Fig. 10(c). Electromagnetic interference between neighboring lanes, meanwhile, is becoming more severe and affects both SI and PI because of constraints on die area and die stacking.
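The way resistance, inductance, and time-varying current combine into a supply drop can be sketched as $V = IR + L\,\mathrm{d}I/\mathrm{d}t$, summed over the series segments of the path. The per-segment values and the load step below are hypothetical placeholders, not data from this paper; only the segment names follow the text.

```python
def supply_drop_V(current_A: float, resistance_ohm: float,
                  inductance_H: float, di_dt_A_per_s: float) -> float:
    """Voltage drop along a power path: V = I*R + L*dI/dt."""
    return current_A * resistance_ohm + inductance_H * di_dt_A_per_s

# Hypothetical per-segment (R in ohm, L in henry) values, for illustration only.
PATH = {
    "BGA ball": (20e-6, 2e-12),
    "PTH": (30e-6, 3e-12),
    "redistribution layer": (80e-6, 1e-12),
    "C4 bump": (20e-6, 1e-12),
    "TSV": (30e-6, 0.5e-12),
    "microbump": (20e-6, 0.5e-12),
}

# Series segments: resistances and inductances simply add along the path.
total_R = sum(r for r, _ in PATH.values())
total_L = sum(l for _, l in PATH.values())

# Assumed load: 500 A steady state with a 50 A step in 10 ns.
drop = supply_drop_V(500.0, total_R, total_L, 50.0 / 10e-9)
print(f"R = {total_R * 1e6:.0f} uohm, L = {total_L * 1e12:.1f} pH, drop = {drop * 1e3:.0f} mV")
```

Splitting the result into its resistive and inductive terms indicates whether a design is limited by static IR drop or by transient droop, which in turn points to thicker metal or to more decoupling capacitance closer to the load.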



Fig. 10 (a) Voltage drop based on resistance, inductance, and time-varying current [27], (b) Lossy pathway from chiplet to BGA [27], (c) Example of PDN impedance ©2024 IEEE [24]

Communication between main dies and EIC can go through (1) wide-and-slow protocols, e.g., Universal Chiplet Interconnect Express (UCIe), which supports bandwidth density up to 10 Tbps/mm, and (2) narrow-and-fast protocols, e.g., SerDes, which supports 224 Gbps per lane and beyond. The challenges for UCIe are similar to those for the HBM protocols and are not repeated here. For high-speed SerDes beyond 100 Gbps, depending on the modulation type, the required bandwidth (Nyquist frequency plus margin) goes up to 80 GHz or higher. In this frequency range the challenges differ from those of UCIe, and new mitigation methods are needed. Two examples are given below.
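The dependence of the Nyquist frequency on modulation type follows from the symbol rate: the Nyquist frequency is half the baud rate, and a PAM-N symbol carries $\log_2 N$ bits. The sketch below treats the quoted lane rates as raw line rates and ignores coding overhead, which is a simplifying assumption.

```python
import math

def nyquist_ghz(bit_rate_gbps: float, pam_levels: int) -> float:
    """Nyquist frequency in GHz: half the symbol rate, with symbol rate = bit rate / log2(levels)."""
    bits_per_symbol = math.log2(pam_levels)
    symbol_rate_gbaud = bit_rate_gbps / bits_per_symbol
    return symbol_rate_gbaud / 2.0

for name, levels in (("NRZ", 2), ("PAM4", 4)):
    for rate in (112.0, 224.0):
        print(f"{rate:.0f} Gb/s {name}: Nyquist at {nyquist_ghz(rate, levels):.0f} GHz")
```

A 224 Gbps PAM4 lane therefore has a Nyquist frequency of 56 GHz, and the channel must stay well behaved somewhat beyond that point, which is consistent with the 80 GHz figure quoted above once margin is added.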



Fig. 11. Example of skip layer design for better SI of TX and RX for high-speed SerDes ©2024 IEEE [28]


The first example is the package unit insertion loss, which can grow three-fold at the Nyquist frequency even with the latest mitigation methods. These methods include lower-loss substrate core and build-up layer materials, thicker ABF film, lower surface roughness, and skip-layer routing design, as shown in Fig. 11. Besides material and substrate stack-up, the bump design, e.g., pitch, ball size, pad scheme, and wiring breakout style, is another key lever for reducing unexpected cavity resonance and avoiding additional modes below the Nyquist frequency. Design examples include the hex pattern, as shown in Fig. 12(a), ground fence, via junction, pattern optimization, and different wiring breakout styles, as shown in Fig. 12(b).

The second example is skew tuning. Skew results from the length difference between the two traces of a differential pair and from trace asymmetry. It degrades both trace insertion loss and common-mode noise, which affects the system link budget. Generally, a skew tolerance of 0.3 UI is allowed and is allocated among the components along the signal path, including on-die, package, PCB, connectors, and flyover cables, as well as environmental effects, e.g., bending, twist, temperature, and humidity. Mitigation techniques include impedance control and serpentine routing with parameters tuned per stack-up.
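A rough sense of how tight a 0.3 UI skew budget is can be obtained by converting it to time and then to an equivalent trace length mismatch. The sketch below assumes a 224 Gbps PAM4 lane and an effective permittivity of 3.3; both are illustrative assumptions, and the budget shown is the total that must still be split among die, package, PCB, connectors, and cables.

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in vacuum, in mm per picosecond

def unit_interval_ps(bit_rate_gbps: float, pam_levels: int) -> float:
    """One unit interval (symbol period) in picoseconds."""
    symbol_rate_gbaud = bit_rate_gbps / math.log2(pam_levels)
    return 1000.0 / symbol_rate_gbaud

def skew_to_length_mm(skew_ps: float, eps_eff: float) -> float:
    """Trace length mismatch that produces the given skew at effective permittivity eps_eff."""
    return skew_ps * C_MM_PER_PS / math.sqrt(eps_eff)

ui = unit_interval_ps(224.0, 4)   # assumed lane: 224 Gb/s PAM4
budget_ps = 0.3 * ui              # 0.3 UI tolerance from the text
mismatch = skew_to_length_mm(budget_ps, eps_eff=3.3)  # eps_eff is an assumed value
print(f"UI = {ui:.2f} ps, skew budget = {budget_ps:.2f} ps, about {mismatch:.2f} mm of mismatch")
```

Under these assumptions the entire end-to-end budget corresponds to well under a millimeter of length difference, which is why skew has to be controlled per stack-up rather than corrected after the fact.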



Fig. 12. (a) Example of pad arrangement scheme square vs hex, (b) Wiring breakout style ©2024 IEEE [28]

### C. Interposer and substrate material

Interposer and substrate material is important to consider in CPO systems since it affects: (1) loss of communication channels and SIPI, (2) heat conduction and crosstalk, (3) stress and warpage. Paragraphs below briefly describe organic, Si, and glass material and their pros and cons.

Organic material offers a low-cost process that has been manufacturable in high volume for more than 20 years. It is relatively simple to laminate layers on organic-based cores; 11 layers per side have been productized in HPC products [29], and 12 layers and above are under development. The package dimension, however, is limited to between 100 mm and 120 mm per side because of large warpage across the package. This warpage results from differences in coefficient of thermal expansion (CTE), Young's modulus, Poisson's ratio, and stiffness among the materials involved, which include Cu, fiber glass core, ABF, underfill, Si, and bumps. SI of high-speed SerDes beyond 200 Gbps per lane is another challenge for organic interposers and substrates because of the lossy dielectric material, thick through-core structure, and rough surfaces. There are continuing efforts to push organic substrates to higher frequencies using lower-loss materials, e.g., polyphenyl ethers (PPE), liquid-crystal polymer (LCP), and polytetrafluoroethylene (PTFE) [30].
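The role of CTE mismatch in limiting package size can be illustrated with the free thermal expansion difference, $\Delta L = |\alpha_a - \alpha_b| \, L \, \Delta T$. The CTE values and temperature excursion below are representative assumptions rather than data from this paper, and the result is the unconstrained in-plane mismatch that drives warpage, not the warpage itself, which also depends on modulus, thickness, and layer arrangement.

```python
def expansion_mismatch_um(length_mm: float, cte_a_ppm: float,
                          cte_b_ppm: float, delta_T_K: float) -> float:
    """Free in-plane expansion difference between two materials over a span, in micrometers."""
    return abs(cte_a_ppm - cte_b_ppm) * 1e-6 * length_mm * 1e3 * delta_T_K

# Representative CTE values in ppm/K; assumptions for illustration only.
CTE_SI = 2.6
CTE_ORGANIC = 15.0
DELTA_T = 235.0  # assumed excursion from 25 C to a 260 C reflow peak

for side_mm in (50.0, 100.0, 120.0):
    d = expansion_mismatch_um(side_mm, CTE_ORGANIC, CTE_SI, DELTA_T)
    print(f"{side_mm:.0f} mm span: about {d:.0f} um of unconstrained mismatch")
```

The mismatch scales linearly with span, which is consistent with warpage becoming the limiting factor as the package side grows toward 100 mm and beyond.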

Glass is an emerging material for low-cost, large-panel, and low-warpage processes, helped by its tunable CTE and Young's modulus, which can be matched closely to Si dies and Cu traces. Common glass types include aluminosilicate, boro-aluminosilicate, and borosilicate. Glass provides thermal conductivity similar to that of organic material, an excellent low-loss core, and low surface roughness. These properties make glass a potential candidate for supporting high-speed SerDes beyond 200 Gbps. Its integration level and layer count, however, are still in the development and verification phase.

Despite its much higher material and processing cost, the Si interposer is predominantly used for high-performance products that assemble main dies and HBMs within hundreds of micrometers of each other, thanks to its low-warpage surface. Si has a proven, precisely controlled, and highly reliable process in high-volume manufacturing. Si, however, is an electrically conductive, lossy dielectric and cannot support long-reach high-frequency SerDes on a large Si interposer or wafer-scale system [31], which instead requires new clock forwarding, re-timing, fault-tolerant mesh networks, etc. The high thermal conductivity of a Si interposer makes it a good heat spreader, but it can also cause significant thermal crosstalk when high-power main dies are integrated close to HBM and PICs.

### D. Light source

Current light sources for optical communication include uLEDs [32] and lasers, depending on the reach and link budget of intra-rack and inter-rack applications. The uLED light source, as shown in Fig. 13, is an emerging technology and is mostly integrated in a large array with transmitters. This large array can provide massive wide-and-slow channels and the light source redundancy needed to meet reliability specifications.




Fig. 13 Example of uLED light source integrated on chip  
©2024 IEEE [32]

For current optical communication in CPO, laser light sources, including distributed-feedback (DFB) lasers and comb lasers, are still required because of their much higher optical output power. With wavelength division multiplexing (WDM), a multi-wavelength source can be built from an array of single-wavelength DFB lasers or from combs with a pump laser. The choice of laser type involves CPO system and rack design in a data center, which is beyond the scope of this paper. Lasers can be sourced internally within a PIC as integrated lasers or externally from a standalone laser module. Based on whether the III-V material is processed before or after integration, integrated lasers can be classified as flip-chip bonded lasers and heterogeneously integrated lasers. The heterogeneously integrated laser has been experimentally proven reliable for years [33, 34], while the flip-chip bonded laser is in the verification phase. External lasers, on the other hand, couple light into the PIC through a fiber to on-PIC waveguides. Although this creates higher optical loss because of multiple interfaces and fiber-core-to-waveguide misalignment, separating the laser sources from the main die package may reduce laser reliability risks and increase multi-source interoperability and serviceability.

### E. Optical coupling strategies

TABLE I Fiber coupling methods vs loss, bandwidth, alignment tolerance, and challenges. The coupling methods include grating, edge, evanescent, microlens, photonic wire bonding, vertical coupler [35], detachable coupling [36,37]

|                           | Loss     | Bandwidth         | 1 dB alignment tolerance           | Challenges                                                        |
|---------------------------|----------|-------------------|------------------------------------|-------------------------------------------------------------------|
| **Grating**               | >1.5 dB  | 30-80 nm for 1 dB | ~2 um                              | High coupling loss                                                |
| **Edge**                  | >0.5 dB  | >100 nm for 3 dB  |                                    | 1. Alignment throughput<br>2. Wafer cleaving<br>3. Wafer sorting  |
| **Evanescent**            | >0.1 dB  | >300 nm for 1 dB  | >2.8 um                            | 1. Exposed waveguide<br>2. Precise alignment                      |
| **Microlens**             | >1.7 dB  |                   | ~30 um                             | 1. Alignment<br>2. Automatic testing                              |
| **Photonic wire bonding** | >0.5 dB  | 300 nm for 1 dB   |                                    | Laser writing time                                                |
| **Vertical coupler**      | <0.1 dB  | >100 nm           | ~2.5 um in Y & Z<br>>50 um in gap  |                                                                   |
| **Detachable**            | <1.5 dB  |                   | +~ 40 um                           |                                                                   |

Fig. 14 Vertical coupler, shown alongside a conventional EC for comparison (panels: Conventional EC, Vertical Coupler) ©2024 IEEE [35]



Fig. 15 Example of detachable connector design, including all the mechanical structure ©2023 IEEE [37]

Among all the coupling strategies, grating coupling (GC) and edge coupling (EC) are the main types. GC is limited by multiplexing type, angle of incidence, polarization, narrow bandwidth, and higher coupling loss, although higher bandwidth density is possible with multiple rows of fibers. EC offers broad multiplexing and broadband capability, negligible polarization dependence, and high coupling efficiency, but wafer sorting is challenging. GC, EC, and variants of EC (e.g., microlenses, evanescent coupling, photonic wire bonding, and the vertical coupler shown in Fig. 14) are summarized in Table I [38] with the corresponding loss, bandwidth, alignment tolerance, and challenges. The next section on fiber attach uses EC as the example.
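Coupling losses in dB add along an optical path, and the remaining power fraction is $10^{-\text{loss}/10}$. The sketch below applies this to the lower-bound loss values listed in Table I for a hypothetical path with one coupler into the PIC and one out of it; the two-coupler path is an assumption for illustration, and actual losses may exceed these lower bounds.

```python
def db_to_fraction(loss_db: float) -> float:
    """Fraction of optical power transmitted through an element with the given loss in dB."""
    return 10.0 ** (-loss_db / 10.0)

def chain_loss_db(*losses_db: float) -> float:
    """Losses in dB add along a cascade of elements."""
    return sum(losses_db)

# Lower-bound per-coupler losses taken from Table I, in dB.
COUPLER_LOSS_DB = {"grating": 1.5, "edge": 0.5, "evanescent": 0.1, "microlens": 1.7}

for name, loss in COUPLER_LOSS_DB.items():
    total = chain_loss_db(loss, loss)  # one coupler into the PIC and one out
    print(f"{name:>10}: {total:.1f} dB in and out, {db_to_fraction(total):.0%} of power remains")
```

Every dB lost at a coupler must be recovered by higher laser output power, so the coupling choice feeds directly into the laser power and thermal budgets discussed earlier.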


### F. Fiber attach

TABLE II Fiber first vs. fiber last challenges [39-41]

|                                     | Fiber first                                             | Fiber last                                             |
|-------------------------------------|---------------------------------------------------------|--------------------------------------------------------|
| Mechanical handling and assembly    | Delicate fiber module handling in each assembly process | Standard microelectronics assembly before fiber attach |
| Planarity during fiber attach       | Best condition, with die-level warpage only             | Substrate-level plus lid-level warpage                 |
| Strain relief                       | No                                                      | Yes, by metal lid                                      |
| Test of known-good sub-engines      | Yes                                                     | Not until assembly is done                             |
| Reflow compatible fiber and ferrule | High-temperature ferrule required                       | Not required                                           |

Edge-coupling fiber attach can be classified by assembly flow into two categories: fiber-first and fiber-last. In a fiber-first process, fiber arrays are positioned on a stress-free, nearly perfectly flat PIC. The PIC thickness is controlled to be uniform, so after the initial setup of the assembly stage, no further adjustment is needed to maintain coplanarity between the V-grooves and the fiber plane. In the fiber-last method, by contrast, the fiber interface area is affected by the substrate warpage caused by flip-chip soldering and underfill curing. Minimizing PIC assembly warpage is crucial to ensure that the parallel fibers are fully seated in the V-grooves under the pressure applied via the buffer lid. Thoughtful design and the selection of suitable substrate materials help reduce warpage and address optical fiber alignment challenges. The fiber-first and fiber-last approaches are compared in Table II.



Fig. 16 Integration variants of EIC, PIC, interposer, fiber attach, and substrate [42]. The edge fiber attach can be either v-groove-based attach or multi-microlens-assisted methods.

In advanced packaging, there are several innovations for EIC and PIC assemblies, including side-by-side, embedded in substrate, 2.5D stacked on interposer, and 3D stacked. The PIC can face up or face down relative to the EIC, and the edge fiber attach can be either V-groove-based or multi-microlens-assisted. As shown in Fig. 16, the integration schemes range from (a) to (l), and each is introduced below:

- (a) Flip-chip EIC and PIC on a substrate with a cutout at the edge
- (b) EIC and PIC facing down, with the PIC protruding from the substrate edge
- (c) EIC and PIC facing down on a substrate without a cutout
- (d) PIC facing up and embedded in a recessed substrate, with direct EIC/PIC 3D stacking
- (e) PIC facing down and soldered to a substrate, with a thinned EIC soldered on the same substrate
- (f) Embedded EIC facing up with the PIC face-to-face
- (g) EIC facing down and partially on a PIC facing up, with another interposer or substrate
- (h) EIC on interposer with the PIC embedded in the substrate
- (i) PIC on interposer with a thinned EIC below the interposer
- (j) PIC on interposer with a thinned EIC flipped and soldered on the other side of the interposer
- (k) EIC-on-PIC 3D integration with TSVs and the PIC facing up
- (l) PIC-on-EIC 3D integration with TSVs and the PIC facing down

The down-selection among these schemes is based on link budget and system design constraints.

### G. Integrated optical technologies

Besides advanced packaging with fiber attach, there are integrated optical innovations targeting co-integration with electronics:

- (1) Integrated lens: a) integrated lensed edge coupler formed by dielectrics, as shown in Fig. 17 [43], and b) integrated microlens coupler (IMC): wafer-scale optical packaging with an SU8-based spherical lens plus a polymer waveguide [44]
- (2) Optical TSV: a) transmitting optical signals from the substrate to the PIC through a micro-mirror, as shown in Fig. 18(a) [45], and b) interlayers on the PIC formed through a layer transfer process, as shown in Fig. 18(b) [46]
- (3) Optical wafer-scale processing: a) die-to-wafer assembly process for an optically interconnected system-on-wafer, as shown in Fig. 19 [47], and b) wafer-scale waveguides, which are fabricated with a stepper and dedicated design of the reticle edges [48]
- (4) 3D waveguides: 3D ultrafast laser inscription is used to realize arbitrary interfaces between two different systems, as shown in Fig. 20 [45]
- (5) Conceptual optical 3DIC with electrical and optical co-wiring, as shown in Fig. 21 [49]



Fig. 17 Integrated lensed edge coupler ©2018 IEEE [43]



Fig. 18 (a) Optical TSV implemented through curved micro-mirror ©2019 IEEE [45] (b) Optical TSV implemented through angled interconnects and layer transfer ©2020 IEEE [46]



Fig. 19 Die-to-wafer assembly process for optically interconnected System-on-wafer ©2024 IEEE [47]



Fig. 20 3D waveguides through 3D ultrafast laser inscription technologies ©2020 IEEE [45]


Fig. 21 Conceptual drawing of 3DIC with electrical and optical routing and signaling ©2023 IEEE [49]

## V. CONCLUSION

This paper has discussed heterogeneous integration in CPO. Multi-physics packaging is exemplified with two cases: a system design consideration spanning the electrical and mechanical domains, and another spanning the electrical, optical, thermal, and mechanical domains. Challenges in HI technologies are discussed together with mitigation methods: (1) thermal crosstalk within the electrical domain and between the electrical and optical domains, both of which can be minimized with novel cooling methods, TIM development, and cooling combined with thermal-aware floor planning and topology optimization; (2) SIPI of wide-and-slow and narrow-and-fast channel links, which can be improved through better isolation, pad arrangement schemes, new wiring breakouts, skip-layer stackups, impedance control, and serpentine skew control; and (3) the pros and cons of organic, Si, and glass as interposer and substrate materials. Optical technologies are introduced, including uLED and laser light sources, coupling through GCs, ECs, and EC variants, fiber attach process innovations with advanced packaging, and integrated optical technologies such as novel microlenses, optical TSVs, 3D waveguides, and optical 3DIC. This article identifies the key HI challenges in CPO and their potential solutions, paving the way for faster development of future CPO technologies.

## ACKNOWLEDGMENT

The authors would like to thank the industrial and academic collaborators involved in MediaTek's CPO developments.

## REFERENCES

- [1] Chih-Ming Hung, "Challenges and Opportunities to Illustrate the AI World with Silicon Photonics," Silicon Photonics Global Summit, SEMICON TAIWAN, Sep. 2024
- [2] J. Sevilla, L. Heim, A. Ho, T. Besiroglu, M. Hobbahn and P. Villalobos, "Compute Trends Across Three Eras of Machine Learning," 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 2022, pp. 1-8, doi: 10.1109/IJCNN55064.2022.9891914.
- [3] Semiconductor Research Corporation, The Decadal Plan for Semiconductors, 2023
- [4] K. -W. Lee, A. Noriki, K. Kiyoyama, T. Fukushima, T. Tanaka and M. Koyanagi, "Three-Dimensional Hybrid Integration Technology of CMOS, MEMS, and Photonics Circuits for Optoelectronic Heterogeneous Integrated Systems," in *IEEE Transactions on Electron Devices*, vol. 58, no. 3, pp. 748-757, March 2011, doi: 10.1109/TED.2010.2099870.
- [5] C. Minkenberg, R. Krishnaswamy, A. Zilkie, D. Nelson, "Co-packaged datacenter optics: opportunities and challenges" *IET Optoelectron.*, Vol. 15, pp. 77-91, 2021, doi: <https://doi.org/10.1049/ote2.120207>.
- [6] M. Mehta, "An AI Compute ASIC with Optical Attach to Enable Next Generation Scale-Up Architectures," *Hot Chips 2024*, pp. 1-30, Aug. 2024. [Online]. Available: <https://hc2024.hotchips.org/assets/program/conference/day1/61_HC2024.Broadcom.ManishMehta.v2-NO-VIDEO.pdf>
- [7] H. Park *et al.*, "A 4.63pJ/b 112Gb/s DSP-Based PAM-4 Transceiver for a Large-Scale Switch in 5nm FinFET," *2023 IEEE International Solid-State Circuits Conference (ISSCC)*, San Francisco, CA, USA, 2023, pp. 5-7, doi: 10.1109/ISSCC42615.2023.10067613.
- [8] R. Yousry *et al.*, "11.1 A 1.7pJ/b 112Gb/s XSR Transceiver for Intra-Package Communication in 7nm FinFET Technology," *2021 IEEE International Solid-State Circuits Conference (ISSCC)*, San Francisco, CA, USA, 2021, pp. 180-182, doi: 10.1109/ISSCC42613.2021.9365752.
- [9] T. Ali *et al.*, "6.2 A 460mW 112Gb/s DSP-Based Transceiver with 3dB Loss Compensation for Next-Generation Data Centers in 7nm FinFET Technology," *2020 IEEE International Solid-State Circuits Conference (ISSCC)*, San Francisco, CA, USA, 2020, pp. 118-120, doi: 10.1109/ISSCC19947.2020.9062925.
- [10] T. Ali *et al.*, "6.4 A 180mW 56Gb/s DSP-Based Transceiver for High Density IOs in Data Center Switches in 7nm FinFET Technology," *2019 IEEE International Solid-State Circuits Conference (ISSCC)*, San Francisco, CA, USA, 2019, pp. 118-120, doi: 10.1109/ISSCC.2019.8662523.
- [11] H. Park *et al.*, "A 212.5Gb/s DSP-Based PAM-4 Transceiver with 50dB Loss Compensation for Large AI System Interconnects in 4nm FinFET," *2025 IEEE International Solid-State Circuits Conference (ISSCC)*, San Francisco, CA, USA, to be published.
- [12] MediaTek, "MediaTek Partners with Ranovus to Enter Niche Market, Expands into Heterogeneous Integration Co-Packaged Optics Industry," TRENDFORCE, Mar. 2024 [Online]. Available: <https://www.trendforce.com/news/2024/03/21/news-mediatek-partners-with-ranovus-to-enter-niche-market-expands-into-heterogeneous-integration-co-packaged-optics-industry/>. [Accessed: Jan. 29.2025].
- [13] BROADCOM, "Broadcom's persistent cadence of co-packaged optics innovation," Feb. 2023 [Online]. Available: <https://www.broadcom.com/blog/broadcoms-persistent-cadence-copackaged-optics-innovation> (Accessed: Jan. 29.2025)
- [14] INTEL, "Intel combines optics to its Tofino 2 switch chip" Feb. 2023 [Online]. Available: <https://www.gazettabyte.com/home/2020/3/19/intel-combines-optics-to-its-tofino-2-switch-chip.html> (Accessed Jan. 29.2025)
- [15] CISCO, "Cisco Demonstrates Co-packaged Optics (CPO) System at OFC 2023" Mar. 2023, [Online]. Available: <https://blogs.cisco.com/sp/cisco-demonstrates-co-packaged-optics-cpo-system-at-ofc-2023> (Accessed Jan. 29.2025)
- [16] Marvell "Marvell Teralynx 10 Announced for 51.2T 800GbE Switching" Mar. 2023, [Online]. Available: <https://www.servethehome.com/marvell-teralynx-10-announced-for-51-2t-800gbe-switching/> (Accessed Jan. 29.2025)
- [17] Chih-Ming Hung, "Microsystem Integration for AI/Compute and Interface Wishes From Designers," IEEE EPS PANEL: Heterogeneous Integration Roadmap Compact session, International Microsystems, Packaging, Assembly and Circuits Technology Conference (IMPACT), Oct. 2024
- [18] International Technology Roadmap for Semiconductors (ITRS), 2011 edition, system drivers, 2011. [Online]. Available: <https://www.semiconductors.org/wp-content/uploads/2018/08/2011SysDrivers.pdf> (Accessed Jan. 29.2025)
- [19] W. Gomes, "Beyond Exascale: A Paradigm shift for AI and HPC," 2023 International Electron Devices Meeting (IEDM), San Francisco, CA, USA, 2023, pp. 1-4, doi: 10.1109/IEDM45741.2023.10413754.
- [20] Zhihao Zhang, Xuehai Wang, Yuying Yan, "A review of the state-of-the-art in electronic cooling," *e-Prime - Advances in Electrical Engineering, Electronics and Energy*, Vol. 1, pp. 1-26, 2021, doi: <https://doi.org/10.1016/j.prime.2021.100009>.
- [21] K. -C. Chang, M. -J. Lii, K. -M. Wang, C. -C. Wang and B. -L. Wu, "A Novel Indium Metal Thermal Interface Material and Package Design Configuration to Enhance High-Power Advanced Si Packages Thermal Performance," *2023 IEEE 73rd Electronic Components and Technology Conference (ECTC)*, Orlando, FL, USA, 2023, pp. 2079-2086, doi: 10.1109/ECTC51909.2023.00356.
- [22] Shinko, "Carbon Nanotube Thermal Interface Material (CNT-TIM)," 2024. [Online]. Available: <https://www.shinko.co.jp/english/product/docs/cnt_EN.pdf>
- [23] Z. Wu, A. R Kidambili, Y.-T. Yang, C.-M. Hung, S. Tian, X. Zhang, J. A. Weibel, L. Pan, "Topology Optimization for Embedded Cooling of Multiple and Transient Workloads in 3D Semiconductor Packages," *The Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm2025)*, to be published.
- [24] J. Yoon et al., "The Energy-Efficient 10-Chiplet AI Hyperscale NPU on Large-Scale Advanced Package," *2024 IEEE 74th Electronic Components and Technology Conference (ECTC)*, Denver, CO, USA, 2024, pp. 1687-1693, doi: 10.1109/ECTC51529.2024.00279.
- [25] K. Cho et al., "Signal and power integrity (SI/PI) analysis of heterogeneous integration using embedded multi-die interconnect bridge (EMIB) technology for high bandwidth memory (HBM)," *2017 IEEE Electrical Design of Advanced Packaging and Systems Symposium (EDAPS)*, Haining, China, 2017, pp. 1-3, doi: 10.1109/EDAPS.2017.8277051.
- [26] S. -F. Yang, W. -C. Wang, Y. -T. Lin, C. -C. Hung, H. -Y. Tung and J. Hsieh, "Signal Integrity Designs at Organic Interposer CoWoS-R for HBM3-9.2Gbps High Speed Interconnection of 2.5D-IC Chiplets Integration," *2024 IEEE 74th Electronic Components and Technology Conference (ECTC)*, Denver, CO, USA, 2024, pp. 1098-1103, doi: 10.1109/ECTC51529.2024.00176.
- [27] W.-C. Wu, "Robust Circuit/Architecture Co-Design for Chiplet Integration" International Solid State Circuits Conference 2024 (ISSCC) Forums 1, San Francisco, CA, USA, 2024.
- [28] Francis Lin, "Highlights and Challenges in deploying 100G+ SerDes" International Solid State Circuits Conference 2024 (ISSCC) Forums 6.1, San Francisco, CA, USA, 2024.
- [29] W. Gomes et al., "Ponte Vecchio: A Multi-Tile 3D Stacked Processor for Exascale Computing," *2022 IEEE International Solid-State Circuits Conference (ISSCC)*, San Francisco, CA, USA, 2022, pp. 42-44, doi: 10.1109/ISSCC42614.2022.9731673.
- [30] A. O. Watanabe, M. Ali, S. Y. B. Sayeed, R. R. Tummala and M. R. Pulugurtha, "A Review of 5G Front-End Systems Package Integration," in *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 11, no. 1, pp. 118-133, Jan. 2021, doi: 10.1109/TCPMT.2020.3041412.
- [31] S. Pal et al., "Designing a 2048-Chiplet, 14336-Core Waferscale Processor," *2021 58th ACM/IEEE Design Automation Conference (DAC)*, San Francisco, CA, USA, 2021, pp. 1183-1188, doi: 10.1109/DAC18074.2021.9586194.
- [32] B. Pezeshki et al., "304 channel MicroLED based CMOS transceiver IC with aggregate 1 Tbps and sub-pJ per bit capability," *2024 Optical Fiber Communications Conference and Exhibition (OFC)*, San Diego, CA, USA, 2024, pp. 1-3.
- [33] Di Liang, John E. Bowers, "Recent Progress in Heterogeneous III-V-on-Silicon Photonic Integration," *Light: Advanced Manufacturing*, Vol. 2, no.5, pp. 59-83, 2021. doi: 10.37188/lam.2021.005
- [34] R. Jones et al., "Heterogeneously Integrated InP/Silicon Photonics: Fabricating Fully Functional Transceivers," in *IEEE Nanotechnology Magazine*, vol. 13, no. 2, pp. 17-26, April 2019, doi: 10.1109/MNANO.2019.2891369.
- [35] H. Hsia, J.Y. Wu, S.W.Liang, T.F. Tsai, S.W. Lu, C.W.Tseng, H.K.Chiu, C.C. Chang, C.H. Tung, C.S. Liu, K.C. Yee, and Douglas C. H. Yu, "EPIC-BOE: An Electronic-Photonic Chiplet Integration Technology with IC Processes for Broadband Optical Engine Applications" *International Electron Device Meeting (IEDM)*, San Francisco, CA, USA, Dec. 2024.
- [36] Teramount, "Self-aligning optics for large assembly tolerances" [Online]. Available: <https://teramount.com/technology/#b1> (Accessed Jan. 29.2025)
- [37] N. Psaila, S. Nekkanty, D. Shia and P. Tadayon, "Detachable Optical Chiplet Connector for Co-Packaged Photonics," in *Journal of Lightwave Technology*, vol. 41, no. 19, pp. 6315-6323, Oct. 2023, doi: 10.1109/JLT.2023.3285149.
- [38] R. Marchetti, C. Lacava, L. Carroll, K. Gradkowski, and P. Minzioni, "Coupling strategies for silicon photonics integrated chips [Invited]," *Photon. Research*, Vol. 7, pp. 201-239, 2019, doi: 10.1364/PRJ.7.000201
- [39] T. Barwicz et al., "Automated, self-aligned assembly of 12 fibers per nanophotonic chip with standard microelectronics assembly tooling," *2015 IEEE 65th Electronic Components and Technology Conference (ECTC)*, San Diego, CA, USA, 2015, pp. 775-782, doi: 10.1109/ECTC.2015.7159680.
- [40] A. Janta-Polczynski et al., "Solder-Reflowable, High-Throughput Fiber Assembly Achieved by Partitioning of Adhesive Functions," *2018 IEEE 68th Electronic Components and Technology Conference (ECTC)*, San Diego, CA, USA, 2018, pp. 1109-1117, doi: 10.1109/ECTC.2018.00170.
- [41] N. Boyer et al., "Automated Assembly of High Port Count Silicon Photonic Switches," *2020 IEEE 70th Electronic Components and Technology Conference (ECTC)*, Orlando, FL, USA, 2020, pp. 132-138, doi: 10.1109/ECTC32862.2020.00034.
- [42] IBM, "IBM assembly and test," [Online]. Available: <https://epic-photronics.com/wp-content/uploads/2021/08/Alexander-Janta-Polczynski_IBM.pdf> (Accessed Jan. 29.2025)
- [43] B. Snyder et al., "Broadband, Polarization-Insensitive Lensed Edge Couplers for Silicon Photonics," *2018 IEEE 68th Electronic Components and Technology Conference (ECTC)*, San Diego, CA, USA, 2018, pp. 841-847, doi: 10.1109/ECTC.2018.00130.
- [44] J. Luo, J. Henriksson, M. -K. Kim, D. Klawson, C. -Y. Fan and M. Wu, "Integrated Microlens Coupler for Photonic Integrated Circuits," *2023 Optical Fiber Communications Conference and Exhibition (OFC)*, San Diego, CA, USA, 2023, pp. 1-3, doi: 10.1364/OFC.2023.Tu3B.2.
- [45] A. Noriki et al., "Optical TSV Using Si-Photonics Integrated Curved Micro-Mirror," *2019 International 3D Systems Integration Conference (3DIC)*, Sendai, Japan, 2019, pp. 1-4, doi: 10.1109/3DIC48104.2019.9058779.
- [46] Y. Zhang, A. Samanta, K. Shang and S. J. B. Yoo, "Scalable 3D Silicon Photonic Electronic Integrated Circuits and Their Applications," in *IEEE Journal of Selected Topics in Quantum Electronics*, vol. 26, no. 2, pp. 1-10, March-April 2020, Art no. 8201510, doi: 10.1109/JSTQE.2020.2975656.
- [47] K. Kennes et al., "Collective die-to-wafer assembly process for optically interconnected System-on-wafer," *2024 IEEE 74th Electronic Components and Technology Conference (ECTC)*, Denver, CO, USA, 2024, pp. 1392-1397, doi: 10.1109/ECTC51529.2024.00227.
- [48] P. Xu et al., "Low-loss, multi-reticle stitched SiN waveguides for 300mm wafer-level optical interconnects," *2024 Optical Fiber Communications Conference and Exhibition (OFC)*, San Diego, CA, USA, 2024, pp. 1-3.
- [49] Heterogeneous integration roadmap 2023: chapter 9 Integrated photonics [Online]. Available: <https://eps.ieee.org/images/files/HIR_2023/ch09_photonics.pdf>



**Dr. Yu-Tao Yang** works in the Strategic Technology Exploration Platform of MediaTek USA, leading several projects, including heterogeneous system-level integration in Co-packaged optics, applied machine learning with heterogeneous integration, wireless/wireline communication, etc.

Dr. Yang received his Ph.D. in Electrical and Computer Engineering from the University of California, Los Angeles (UCLA) in 2022. He received the M.S. degree in Electrical Engineering in 2017 and the B.S. degree in the Undergraduate Honors Program of Nano Science in 2016, both from National Chiao Tung University (NCTU), Taiwan.

Dr. Yang has authored and co-authored 50+ journal articles and papers in venues including ACS Nano, the International Electron Devices Meeting (IEDM), Electron Device Letters (EDL), Transactions on Electron Devices (TED), and Transactions on Components, Packaging and Manufacturing Technology (TCPMT). He has given several invited talks and has received awards, fellowships, and scholarships.


**Dr. Chih-Ming Hung** leads the Strategic Technology Exploration Platform of MediaTek, researching various fields such as RF/analog, wireless/wireline communication, low power circuits and systems, heterogeneous integration, etc. He also manages the global research partnership for MediaTek. Prior to his current role, he was the CTO of the MediaTek Intelligent Automotive Business Unit. He also led the automotive and consumer mmWave radar development at MediaTek.

Between 2011 and 2014, Dr. Hung was an Associate Vice President at MStar Semiconductor responsible for its worldwide RF R&D Unit. Between 2009 and 2011, he held a prestigious role in the Texas Instruments Kilby Labs, overseeing and actively contributing to a wide range of projects spanning analog, low-power implantable transceivers, and mmWave and sub-THz circuits and systems. Between 2000 and 2009, he was a key innovator at Texas Instruments developing Digital RF for wireless communication. Dr. Hung held an elected title of Distinguished Member Technical Staff at Texas Instruments.

Dr. Hung is currently an IEEE Fellow, the ISSCC Wireless Subcommittee Chair, and an Associate Editor of IEEE OJ-CAS. He has authored and co-authored ~75 papers and ~150 patents, and has given several dozen invited lectures.