Co-Packaged Optics Reaches Power Efficiency Tipping Point

But blazing fast data speeds come with significant manufacturing challenges.

Commercialization has started for network switches based on co-packaged optics (CPO), which are capable of routing signals at terabits per second speeds, but manufacturing challenges remain regarding fiber-to-photonic IC alignment, thermal mitigation, and optical testing strategies.

By moving the optical-to-electronic data conversion as close as possible to the GPU/ASIC switch in data centers, CPO significantly boosts bandwidth and reduces the power consumption needed to run generative AI and large language models. Implementing co-packaged optics is expected to slash energy costs for training AI models and dramatically increase the energy efficiency of data centers.

“While today’s AI accelerators, GPUs, and high-capacity networking switches are pushing the boundary of compute capabilities at a rapid rate, they’re constrained by the interconnect bottlenecks at the chip level, board level, tray level, and rack level,” said David Clark, vice president of product marketing at Amkor Technology. “CPO breaks down these limitations by delivering 1 Tbps/mm bandwidth density, enabling higher front panel port density and optimizing valuable rack space in increasingly crowded data centers.”

In data centers today, network switches in a rack of computers consist of GPU/ASIC chips that are electrically connected through a PCB to pluggable optical transceivers in the front of the rack. These transceivers combine lasers, optical circuits, DSPs and other electronics. The devices link electrically to the switch and optically to optical fibers that run through the data center.

This approach is effective, but it’s inefficient. The electrical traces on the circuit board consume significant power and limit the speed and density of data transfer due to signal loss and constraints on pin count and crosstalk. That’s where optical interconnects come in.

“Due to low transmission loss over fiber, optical signaling enables increased reach and is commercially standardized and common at the board and rack level in the form of pluggable optical I/Os,” said Mark Gardner, vice president and general manager of the Advanced System Assembly & Test Business Group at Intel. “In today’s pluggable optical I/O modules, the optical I/O signaling engine is outside the package of the switch/compute node. Therefore, the bottleneck of bandwidth, energy efficiency, and latency remains due to the electrical connections between the compute/switch/FPGA node and the optical engine.”


Fig. 1: Process steps for co-packaged fabrication and assembly. Source: ASE

Key optical components in CPO include laser transmitters, photodetectors, waveguides, modulators, and silicon photonic integrated circuits (PICs). The modulator, typically a micro-ring or Mach-Zehnder modulator, converts electrical signals to optical signals, while also controlling the delivery of those optical signals.

“Replacing the optical engine in pluggable transceivers, silicon photonic-based optical I/O chiplets are typically designed with dense wavelength division multiplexing (DWDM). That allows data bandwidth scaling per fiber port,” Gardner said. “Additionally, these chiplets are becoming smaller in size due to advances in silicon photonic device miniaturization, allowing their co-integration with the compute node in an advanced package. The co-integration reduces the electrical signaling distance to as low as 100µm, thereby unblocking the bottlenecks of power, bandwidth density, and latency that are pertinent in the off-the-package electrical signaling.”

In one CPO configuration, the compute chip is surrounded by 4 or 8 silicon photonic IC transceiver chiplets. Those will be packaged together, with the exception of the laser, which typically is kept separate because it is the least reliable component. “The primary advantage of co-packaged optics lies in its ability to significantly reduce the power consumption associated with high-speed data transmission from ~15 pJ/bit with pluggable modules to ~5 pJ/bit (with a projected path to <1 pJ/bit),” said Vikas Gupta, senior director of product management at GlobalFoundries.
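To put the energy-per-bit figures in context, the short sketch below multiplies energy per bit by aggregate throughput to get optical I/O power. The 51.2 Tbps switch bandwidth is an assumption chosen for illustration and is not taken from the quoted sources. The 1 pJ/bit case stands in for the "<1 pJ/bit" projection as an upper bound.

```python
def optical_io_power_watts(energy_pj_per_bit: float, bandwidth_tbps: float) -> float:
    """Power (W) = energy per bit (J/bit) x throughput (bit/s)."""
    joules_per_bit = energy_pj_per_bit * 1e-12
    bits_per_second = bandwidth_tbps * 1e12
    return joules_per_bit * bits_per_second


# Assumed switch throughput, for illustration only (not from the article).
SWITCH_BANDWIDTH_TBPS = 51.2

for label, pj_per_bit in [
    ("pluggable modules (~15 pJ/bit)", 15.0),
    ("co-packaged optics (~5 pJ/bit)", 5.0),
    ("projected CPO (1 pJ/bit upper bound)", 1.0),
]:
    watts = optical_io_power_watts(pj_per_bit, SWITCH_BANDWIDTH_TBPS)
    print(f"{label}: {watts:.1f} W")
```

Under that assumption, the optical I/O budget falls from roughly 768 W to about 256 W, with about 51 W as the projected bound.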

Co-packaged optics also improves signal integrity because shorter electrical signal paths have lower parasitic losses. “By packaging the optical engine directly with the switching chip, the distance electrical signals must travel is drastically reduced,” said Sander Roosendaal, R&D engineering director at Synopsys Photonics Solutions. “This reduction in electrical trace length means the SerDes (serializer/deserializer) components need to handle much lower signal loss (1 to 2 dB compared to more than 20 dB in standard designs). While a photonic IC-based transceiver (which is central to CPO) can offer around 10 times the performance of a traditional transceiver, despite being a similar size, the co-packaging itself directly addresses the electrical interface bottleneck that limits pluggable solutions, providing the critical jump in power and performance required for future data-intensive systems.”
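Decibel figures can be hard to compare intuitively. As a rough illustration, the sketch below converts a loss in dB to the fraction of signal power that remains, using the standard power-ratio definition. The function name is hypothetical, and treating channel loss as a pure power ratio is a simplification.

```python
def db_loss_to_fraction(loss_db: float) -> float:
    """Fraction of power remaining after a loss of `loss_db` decibels."""
    return 10 ** (-loss_db / 10)


for loss_db in (1.0, 2.0, 20.0):
    remaining = db_loss_to_fraction(loss_db)
    print(f"{loss_db:>4.1f} dB loss leaves {remaining:.1%} of the power")
```

A 1 to 2 dB channel keeps roughly 60% to 80% of the power, while a 20 dB channel keeps 1%, which is why the SerDes in a pluggable design has to work so much harder.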

Fiber alignment challenges
Despite its benefits, co-packaged optics faces multiple manufacturing challenges, including achieving excellent fiber-to-PIC alignment accuracy. In CPO, the optical fiber and the photonic IC are co-integrated in the same package using passive or active alignment processes. Accurate alignment of the fiber to the waveguide facet on the chip is essential for efficient coupling of optical signals. The most common passive alignment process uses V-grooves.

“Techniques such as the V-groove approach offer the lowest loss interfaces by connecting the fiber directly (and permanently) to the PIC,” said GlobalFoundries’ Gupta. “Index-matching materials and adhesives are used to minimize losses due to changes in the refractive index along the light transmission path. While detachable fiber solutions enhance the repairability of the fiber interface, the light typically travels through various turning mirrors and material interfaces, adding approximately 1dB of loss per fiber interface.”

In fact, the challenge of connecting optical fibers at scale is one of the key factors that has kept CPO from entering high-volume manufacturing. “Connecting tiny silicon waveguides on a chip to external optical fibers is one of the most difficult tasks in packaging integrated optical devices,” said Mitch Heins, business development manager at Synopsys. “Efficiently coupling light between a standard single-mode (SM) optical fiber and a silicon-on-insulator (SOI) waveguide is very challenging. This difficulty arises because the fiber and the SOI waveguide have very different refractive index contrasts, sizes, and shapes in their cross-sections, leading to a mismatch in how light is distributed within them. A typical SM optical fiber is much larger, with an 8µm to 10µm diameter, while an SOI waveguide might be just 500nm x 220nm. This scale difference is like trying to align a basketball-sized pipe to a pea-sized tube, which can cause the majority of the light to be lost. Besides the basic mode mismatch, the waveguide facet must be very well polished, and the alignment between the fiber and the SOI waveguide itself is a critical factor affecting coupling loss.”
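The scale of the mode mismatch can be estimated with a textbook Gaussian-overlap formula. The sketch below is a simplified model and not a method from the quoted sources. It approximates both modes as perfectly aligned elliptical Gaussians, takes the fiber mode radius as 5µm (half of a 10µm diameter), and uses half the waveguide core dimensions (0.25µm x 0.11µm) as a crude stand-in for the waveguide mode radii. Real waveguide modes extend beyond the core, so the numbers are only indicative.

```python
import math


def gaussian_mode_overlap(w1x: float, w1y: float, w2x: float, w2y: float) -> float:
    """Power coupling between two aligned elliptical Gaussian modes.

    Arguments are 1/e^2 mode-field radii (same units) along x and y.
    Each axis contributes a factor 2*w1*w2 / (w1^2 + w2^2).
    """
    eta_x = 2 * w1x * w2x / (w1x**2 + w2x**2)
    eta_y = 2 * w1y * w2y / (w1y**2 + w2y**2)
    return eta_x * eta_y


# Assumed mode radii in micrometers (illustrative, see lead-in).
FIBER_RADIUS_UM = 5.0
WAVEGUIDE_RADIUS_X_UM = 0.25
WAVEGUIDE_RADIUS_Y_UM = 0.11

eta = gaussian_mode_overlap(
    FIBER_RADIUS_UM, FIBER_RADIUS_UM, WAVEGUIDE_RADIUS_X_UM, WAVEGUIDE_RADIUS_Y_UM
)
print(f"coupling efficiency: {eta:.4%}")
print(f"coupling loss: {-10 * math.log10(eta):.1f} dB")
```

With these assumptions, direct butt coupling passes well under 1% of the light, which is why practical designs add mode-converting structures between the fiber and the waveguide.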

Heins explained that active alignment uses external manipulators, or a precision alignment system with six degrees of freedom, to move the fiber or fiber array around while the optical power is transmitted through the waveguide or PIC. When the maximum optical power is reached, the fiber is permanently attached to the waveguide.

“The first challenge is getting the fiber and photonic IC to align within a couple tenths of a micron to minimize signal loss. You try to get the loss down to on the order of 1 dB, which is achievable,” said Dick Otte, CEO of Promex. “The second challenge is holding the fibers in place over the long term. That’s a matter of the stability of the physical structure and the epoxies or acrylates that are holding the fiber in place. Many people are still using the V-groove approach, which is a pretty good scheme that is well-documented. What is changing is we’re now aligning arrays as opposed to just single fibers, and that’s been an important advance. It reduces the cost per alignment greatly. I anticipate the number of arrays will increase dramatically as the data rates go up.”

But moving from single-mode fibers to fiber arrays brings significant alignment challenges. “For multi-channel devices, like arrays of fibers coupling to arrays of grating couplers, the alignment process requires careful adjustment to ensure the entire array is correctly positioned and parallel to the chip features,” said Synopsys’ Heins. “Automated systems often use optical feedback to first find the light signal and then perform a gradient search to optimize coupling efficiency across multiple fibers simultaneously. This can involve complex scan patterns using precision stages.”
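The feedback loop behind active alignment can be illustrated with a toy simulation. The sketch below is not any vendor's algorithm. It uses a simple hill-climbing search with a shrinking step as a stand-in for the gradient search described above, works in only two axes instead of six, and replaces the real photodetector reading with a hypothetical Gaussian coupling model. All names and numbers are assumptions.

```python
import math


def coupled_power(x_um, y_um, true_x=0.8, true_y=-0.5, mode_radius_um=5.0):
    """Simulated normalized detector reading (hypothetical Gaussian model)."""
    d2 = (x_um - true_x) ** 2 + (y_um - true_y) ** 2
    return math.exp(-d2 / mode_radius_um**2)


def align(measure, x=0.0, y=0.0, step_um=1.0, min_step_um=0.01):
    """Hill-climb on measured power, halving the step when no move helps."""
    best = measure(x, y)
    while step_um >= min_step_um:
        improved = False
        for dx, dy in ((step_um, 0), (-step_um, 0), (0, step_um), (0, -step_um)):
            power = measure(x + dx, y + dy)
            if power > best:
                # Keep the move only if the detector reading went up.
                x, y, best = x + dx, y + dy, power
                improved = True
        if not improved:
            step_um /= 2  # refine the search around the current peak
    return x, y, best


x, y, power = align(coupled_power)
print(f"stage position: ({x:.3f}, {y:.3f}) um, normalized power: {power:.6f}")
```

A production system would add a coarse "first light" scan, rotation axes, and a figure of merit that balances power across every channel in the array, but the loop structure of move, measure, and keep the better position is the same.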

Heins pointed to specifications of 0.1µm alignment for low power loss, <50nm transverse alignment tolerance, and 3D alignment for fiber arrays. The mechanical tolerances for package-level optical connections are extremely tight.
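A rough feel for why the tolerances are so tight comes from the standard result for two identical Gaussian modes with a lateral offset d and mode-field radius w, where the coupled power falls as exp(-d²/w²). The sketch below applies that formula. The 1.5µm mode radius is an assumed value for a spot-size-converted waveguide mode and is not from the quoted specifications.

```python
import math


def lateral_offset_loss_db(offset_um: float, mode_radius_um: float) -> float:
    """Loss (dB) from a lateral offset between two identical Gaussian modes."""
    efficiency = math.exp(-((offset_um / mode_radius_um) ** 2))
    return -10 * math.log10(efficiency)


# Assumed mode-field radius in micrometers (illustrative only).
MODE_RADIUS_UM = 1.5

for offset_um in (0.05, 0.1, 0.5, 1.0):
    loss_db = lateral_offset_loss_db(offset_um, MODE_RADIUS_UM)
    print(f"offset {offset_um:.2f} um -> {loss_db:.3f} dB")
```

Under this assumption, a 0.1µm offset costs a few hundredths of a dB, while a 1µm offset costs about 2 dB, more than the entire 1 dB target mentioned earlier.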

“The typical optical features that allow for precision alignment may include structures such as V-grooves fabricated on the PICs, or micro-optical elements such as mirrors or lenses to allow for optical signals to be routed from fibers to PICs,” said Intel’s Gardner.


Fig. 2: Two possible CPO configurations show different positioning of fiber array unit, which affects optical coupling. Source: ASE

Thermal mitigation
Just like electronic ICs, photonic ICs are sensitive to temperature changes.

“Thermal fluctuations due to high-power devices in the package, such as GPU, ASIC, or switch die, can fluctuate the temperature of the photonics devices in the co-packaged PIC,” said Intel’s Gardner. “The fluctuations can impact functionality and performance of the photonic devices, such as the ring resonators and modulators. These devices are sensitive to temperature changes and often work best when kept within a temperature window. Unintended temperature changes due to the integration environment can result in the resonance shifting, which in turn can result in performance or functionality degradation.”

Temperature fluctuations may seem tiny, but the impact is significant. “In most photonic systems, a shift in temperature of 1° C typically results in a wavelength shift of ~0.1nm,” noted Amkor’s Clark. “In today’s systems, most implementations use single wavelength and micro-ring modulator architecture, which have relatively low or manageable sensitivity to thermal effects. However, as CPO continues to develop, bandwidth demand continues to rise with the need to reduce the fiber bundles. We could see the introduction of dense wavelength division multiplexing (DWDM) architectures. In this case, temperature and wavelength stability become far more critical and will present new packaging challenges.”
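To see why DWDM tightens the thermal budget, the sketch below compares the ~0.1nm/°C drift quoted above with the wavelength spacing between adjacent channels, using the relation Δλ = λ²Δf/c. The 1550nm center wavelength and 100 GHz channel spacing are assumptions for illustration. CPO links may use other bands and grids.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
DRIFT_NM_PER_DEG_C = 0.1  # figure quoted in the article


def channel_spacing_nm(center_nm: float, spacing_ghz: float) -> float:
    """Convert a frequency grid spacing to wavelength spacing (nm)."""
    center_m = center_nm * 1e-9
    spacing_hz = spacing_ghz * 1e9
    return center_m**2 * spacing_hz / SPEED_OF_LIGHT_M_PER_S * 1e9


# Assumed DWDM grid, for illustration only.
spacing_nm = channel_spacing_nm(center_nm=1550.0, spacing_ghz=100.0)
degrees_to_next_channel = spacing_nm / DRIFT_NM_PER_DEG_C

print(f"channel spacing: {spacing_nm:.2f} nm")
print(f"uncorrected drift reaches the next channel after ~{degrees_to_next_channel:.0f} C")
```

With these numbers the channels sit about 0.8nm apart, so an uncompensated swing of roughly 8°C would move a resonance a full channel. Usable margins are much smaller than that, since a device degrades well before it reaches its neighbor.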

At the package level, stacks of thermal interface materials are chosen carefully to reduce the PIC temperature fluctuations and keep them within a pre-defined range. “We also consider what thermal management approaches may be required to keep the optical components within their temperature windows — even when considering large changes in overall thermal boundary conditions and/or package level power distributions,” Gardner said. “Inside the PIC or its accompanying EIC, sensing and control circuits IP are implemented to maintain performance and functionality in the presence of the PIC temperature range.”

GlobalFoundries’ Gupta concurred regarding the thermal mitigation schemes. “The proximity of the optical interface to large thermal sources presents challenges that must be carefully managed. Most co-packaged optics systems use external lasers due to wavelength shifts and reliability concerns of compound semiconductor-based light sources under high temperatures. Mechanical design and characterization of optical interfaces, such as fiber attachment, must account for differences in thermal expansion between silicon and organic or polymeric materials. Additionally, on-die devices need to be characterized and qualified at higher temperatures (>105°C) to ensure optimal performance. Modulators have local heaters for tuning wavelengths of interferometric and resonant devices. While photodiodes may exhibit higher dark current at elevated temperatures, they are designed to mitigate reliability concerns.”
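The thermal expansion mismatch mentioned above can be estimated with the linear expansion relation ΔL = (α₁ − α₂)·L·ΔT. The sketch below uses about 2.6 ppm/°C for silicon and an assumed 17 ppm/°C for an organic material, over an assumed 5mm span and 80°C swing. All of these values are illustrative and not taken from the quoted sources.

```python
def cte_mismatch_um(length_mm, delta_t_c, cte_a_ppm, cte_b_ppm):
    """Differential expansion (um) between two bonded materials."""
    differential_strain = abs(cte_a_ppm - cte_b_ppm) * 1e-6 * delta_t_c
    return differential_strain * length_mm * 1000.0  # mm -> um


# Illustrative assumptions (see lead-in).
SILICON_CTE_PPM = 2.6
ORGANIC_CTE_PPM = 17.0

shift_um = cte_mismatch_um(
    length_mm=5.0, delta_t_c=80.0, cte_a_ppm=ORGANIC_CTE_PPM, cte_b_ppm=SILICON_CTE_PPM
)
print(f"unconstrained differential expansion: {shift_um:.2f} um")
```

The unconstrained result is several microns, far larger than the sub-micron alignment targets discussed earlier, which is why the fiber attach structure and adhesives need careful mechanical design and characterization.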

There are other challenges, as well. “The failure of the laser is still the largest single cause of defects that I’m aware of in these systems,” said Promex’s Otte. “So the issue around ensuring known good die largely means known good lasers. People are working to burn them in. And the requirements will get harder over the next few years with multiwavelength lasers.”

Design for reliability
Ensuring reliability is a challenge with any new technology, but it’s particularly difficult when it involves multi-die integration. “The need for a known good die/module (KGD/KGM) has become crucial as optical interfaces are placed on the same board or interposer as the ASIC/xPU,” said GlobalFoundries’ Gupta. “Electro-optical functional test platforms and testing to establish KGD/KGM are active research areas. Large test platform companies made announcements in this area at the Optical Fiber Communications (OFC) conference earlier this year. While significant strides have been made in electrical testing, optical testing solutions for fast (and non-permanent) alignment of optical fiber probes are still being developed.”

As the industry transitions away from pluggable transceivers to co-packaged optics, built-in reliability is more important than ever, especially given the high cost of compute chips. “Instead of relying on the ability to quickly replace a failed unit with pluggable modules, co-packaged optics addresses failures by focusing on enhancing the intrinsic reliability of components and packaging, designing in redundancy, implementing integrated monitoring and self-correction,” said Synopsys’ Roosendaal. He describes these strategies as:

  1. Designing for High Reliability: The silicon photonics components themselves, like passive devices, germanium photodetectors, depletion modulators, and integrated heaters, are being developed to have high intrinsic reliability, with many showing very low failure rates (e.g., below 1 Failure in Time, or FIT). Hybrid-attached III-V lasers on silicon have also demonstrated reliability meeting standards like Telcordia GR-468. Achieving high reliability in packaging, including assembly processes, materials (like adhesives), and structures, is critical and has been demonstrated through rigorous testing like JEDEC stress tests involving thermal cycling and damp heat.
  2. Incorporating Redundancy: Since replacing a failed component is difficult, the design incorporates backup features. A key example is the inclusion of redundant lasers. If a primary laser degrades or fails, a spare laser can be switched in, often automatically, to maintain operation. This switching can be very fast, minimizing downtime. Similarly, for complex photonic integrated circuits, extra components can be included as a backup as part of the design and manufacturing process. Using laser arrays, where failure of one laser only affects a fraction of the link, also can offer higher reliability compared to a single-point-of-failure source like an optical frequency comb. A fault-tolerant design with low component stress levels is considered essential.
  3. Integrated Monitoring and Self-Correction: More advanced designs include integrated monitors and control electronics that can detect performance degradation or failures. For instance, the degradation of an active laser can be monitored, triggering the switch to a redundant one. Built-in self-test (BiST) is also incorporated, where possible, to check electronic connections and functions. The use of non-volatile components like memristors potentially could allow for post-fabrication error correction for certain photonic devices.
  4. Emphasis on High Manufacturing Yield and Early Testing: Given the complexity and integrated nature, ensuring that components and the assembled module work correctly before deployment is paramount. Testing becomes critical at various stages, including wafer-level and die-level testing, to identify and remove defective parts as early as possible. If fault coverage is insufficient at early stages, a complex multi-die assembly like CPO can experience catastrophic yield losses at the module level. Evaluating failed parts helps improve earlier processes.
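The reliability arguments above can be made concrete with a few standard formulas. The sketch below assumes a constant failure rate (exponential model), independent failures, and ideal switchover to a spare. The 500 FIT laser rate, five-year service life, and per-die yields are hypothetical values chosen for illustration.

```python
import math

HOURS_PER_YEAR = 8760


def failure_probability(fit: float, hours: float) -> float:
    """Probability of failure within `hours`; FIT = failures per 1e9 device-hours."""
    return 1.0 - math.exp(-fit * hours / 1e9)


def redundant_failure_probability(p_single: float, n_units: int) -> float:
    """All n independent units must fail for the function to be lost."""
    return p_single**n_units


def module_yield(component_yields) -> float:
    """Module yield if every component must be good and no rework is possible."""
    return math.prod(component_yields)


# Hypothetical inputs, for illustration only.
service_hours = 5 * HOURS_PER_YEAR
p_laser = failure_probability(fit=500.0, hours=service_hours)
p_with_spare = redundant_failure_probability(p_laser, n_units=2)

print(f"single laser fails within 5 years: {p_laser:.2%}")
print(f"laser plus one spare both fail:    {p_with_spare:.4%}")

# One compute die plus eight optical chiplets, each with an assumed test escape.
yields = [0.99] + [0.98] * 8
print(f"module yield from 9 dies: {module_yield(yields):.1%}")
```

With these inputs a single spare cuts the laser-related failure probability from roughly 2% to below 0.05%, and nine dies that are each 98% to 99% good still leave close to one module in six defective, which shows why early fault coverage matters.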

GlobalFoundries’ Gupta agreed that for photonic devices, Telcordia GR-468-CORE is typically used for reliability assessment. “However, with silicon photonics packaging technologies becoming more CMOS-like, GlobalFoundries also uses JEDEC-based reliability specifications. Photonic devices need to be characterized at extended temperatures (>105°C). Silicon, as a material system, is inherently more reliable than some compound semiconductor solutions.”

2.5D vs. 3D integration
For now, both 2.5D and 3D packaging approaches are being pursued for co-packaged optics. In 2.5D, the EIC and PIC sit side-by-side on a silicon interposer, through which they are electrically connected. Copper pillar micro-bumps and through-silicon vias provide interconnections.

“An added advantage of the interposer is the option to further integrate waveguides, gratings or filters to couple in the optical signals,” said Amkor’s Clark. “Formation of these optical features is typically highly compatible with front-end CMOS foundry processes, constructed using conventional silicon nitride, silicon dioxide or even polyimide layers.”

3D CPO takes advantage of new processes such as hybrid bonding, with the electronic IC atop the photonic IC for thermal reasons. But Intel breaks down the options even further.

“When evaluating PIC, EIC strategies for CPO, there are two primary approaches — monolithic integration, where the photonic (PIC) and associated electronic circuits (EIC) are co-fabricated on the same die, versus 3D die-stacked integration, where PIC and EIC are fabricated separately and then 3D bonded,” said Gardner. “Monolithic-PIC (same die having PIC and EIC) integration in an EMIB package with xPU (2.5D) offers tight electrical coupling with minimal parasitic within the die between the PIC and EIC circuits, resulting in better power efficiency and reduced latency. It also simplifies thermal aspects and packaging configuration. However, the monolithic PIC restricts the use of leading-edge nodes for electronic IC, which is important for I/O bandwidth density scaling.”

3D adds other benefits, as well. “Die-stacked integration of PIC and EIC allows each die to be fabricated on its optimal process — electronic ICs on advanced CMOS nodes and photonics on high-performance platforms like silicon photonics. This yields good performance in each domain and offers greater design modularity and reuse,” said Gardner. “The tradeoff is increased complexity in assembly, thermal management, and higher packaging cost due to advanced technologies such as TSV, HBI, etc. The 3D PIC/EIC stack can then be integrated with xPU in advanced packages with EMIB, resulting in a 3.5D CPO solution.”

Conclusion
Co-packaged optics is a promising frontier in advanced packaging that brings much needed gains in bandwidth and energy efficiency to power-hungry data centers. Fortunately, many of the technologies that apply to silicon electronics also apply to silicon photonics.

Even so, to produce these advanced packages at scale requires accurate and precise means of aligning fibers to photonic ICs with very low signal loss, advanced thermal management strategies, testing methods for optical components, and a level of built-in self test and redundancy to ensure high reliability under the continual operating conditions of AI data centers.


