Real-time inference and large-scale training are pushing the limits of interconnect performance.
By Madhumita Sanyal and Diwakar Kumaraswamy
The rapid escalation of AI/ML workloads—driven by increasingly large language models—is reshaping high-performance computing and AI data center architectures. Real-time inference and large-scale training are pushing the limits of compute and interconnect performance. With model sizes and parameter counts doubling every 4–6 months, infrastructure must provide extreme computational throughput along with unprecedented bandwidth and ultra-low-latency communication across thousands of accelerators.
This article examines the system-level challenges in enabling seamless 1.6 Tbps port-level interoperability for next-generation SoCs. It focuses on the role of 224G SerDes, emerging interconnect standards, and the stringent signal and power integrity requirements in dense, high-speed designs.
Modern LLMs such as Llama 3 require up to 700 TB of memory and 16,000 accelerators for pre-training. No single GPU or accelerator can meet these requirements, necessitating tightly coupled clusters of tens of thousands of devices. This scale places extreme stress on the network fabric, which must support both scale-up (intra-rack, low-latency) and scale-out (inter-rack, high-bandwidth) topologies.
To address these demands, new protocols have emerged, notably Ultra Ethernet and UALink.
Both protocols are built atop a new generation of physical layer technology: 224G SerDes.
Fig. 1: AI scaling architectures with 1.6T Ultra Ethernet & UALink.
To ensure that 224G SerDes solutions are interoperable and reliable across the industry, standards bodies such as IEEE and OIF are developing electrical and long-reach (LR) specifications, with ratification expected by 2025. In addition, version 1 of the Ultra Ethernet specification was recently released, and UALink 200G was announced earlier this year. These standards are crucial: they provide a common framework that allows components from different vendors to work together, which is essential for the rapid deployment and scalability of modern data center infrastructure.
At 224G, the Nyquist frequency is doubled compared to 112G, which dramatically increases the impact of channel loss and crosstalk. Every element in the signal path—PCB traces, connectors, and packaging—contributes more loss at these higher frequencies. For example, 32 AWG Twinax cable can exhibit around 14 dB/m insertion loss at 56 GHz, and total channel loss in a typical system can easily reach 40–50 dB. This level of attenuation makes traditional PCB-based routing insufficient for many high-speed links, driving the adoption of advanced materials, improved connector designs, and alternative approaches such as flyover cables to preserve signal integrity.
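The arithmetic behind these figures can be sketched in a few lines. The example below is a minimal illustration, assuming PAM4 signaling (2 bits per symbol), a nominal 224 Gb/s line rate with no coding overhead, and cable loss that scales linearly with length at the 14 dB/m figure quoted above. The function names and the 17 dB fixed-loss value in the usage section are hypothetical, chosen only for illustration.

```python
def nyquist_ghz(line_rate_gbps: float, bits_per_symbol: float = 2.0) -> float:
    """Nyquist frequency in GHz: half the symbol (baud) rate.

    Assumes PAM4 by default (2 bits per symbol).
    """
    baud_gbd = line_rate_gbps / bits_per_symbol
    return baud_gbd / 2.0


def cable_loss_db(length_m: float, loss_db_per_m: float = 14.0) -> float:
    """Insertion loss of a cable run, assuming loss scales linearly with length."""
    return length_m * loss_db_per_m


def max_cable_length_m(budget_db: float, fixed_loss_db: float,
                       loss_db_per_m: float = 14.0) -> float:
    """Longest cable that fits once fixed losses (package, PCB, connectors)
    are subtracted from the total channel loss budget."""
    remaining_db = max(budget_db - fixed_loss_db, 0.0)
    return remaining_db / loss_db_per_m


if __name__ == "__main__":
    print("Nyquist at 112G:", nyquist_ghz(112.0), "GHz")
    print("Nyquist at 224G:", nyquist_ghz(224.0), "GHz")
    print("2 m cable loss:", cable_loss_db(2.0), "dB")
    # Hypothetical split: 45 dB budget, 17 dB consumed outside the cable.
    print("Max cable length:", max_cable_length_m(45.0, 17.0), "m")
```

Under these assumptions, a 2 m cable alone consumes more than half of a 45 dB budget, which is why the loss contributed by the package, board, and connectors matters so much at this data rate.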
Maintaining robust data transmission at 224G requires advanced digital signal processing (DSP) and equalization techniques to compensate for severe channel impairments. Modern SerDes architectures incorporate:
These receiver blocks work together to ensure open eye diagrams and low bit error rates (BER), even in challenging environments, with pre-FEC BER targets orders of magnitude better than 1E-4. The DSP must be adaptable, supporting both short-reach (chip-to-module) and long-reach (backplane, flyover, or optical) channels.
At these speeds, every block in the SerDes architecture must be enhanced. The analog front end (AFE) requires more bandwidth, and the analog-to-digital converters (ADCs) must deliver lower noise. The phase-locked loop (PLL) jitter budget tightens because the unit interval (UI) shrinks, and the DSP must provide robust equalization, often using advanced techniques such as maximum likelihood sequence detection (MLSD), to compensate for 45 dB of channel loss. Importantly, these improvements must be achieved without a proportional increase in power consumption, even as switches and accelerators integrate 200+ SerDes lanes.
Fig. 2: Synopsys 224G architecture advancements were discussed at ISSCC 2024 in paper 7.3, "A 224Gb/s 3pJ/b 40dB Insertion Loss Transceiver in 3nm FinFET CMOS" (IEEE Xplore).
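To put the jitter and power constraints in numbers, the sketch below computes the unit interval and an aggregate SerDes power estimate. It is a rough illustration only: it assumes PAM4 signaling (2 bits per symbol), takes the 3 pJ/b energy figure from the title of the ISSCC paper cited in Fig. 2, and uses the 200-lane count mentioned above. Real energy per bit varies with channel loss and operating mode, and the function names are hypothetical.

```python
def unit_interval_ps(line_rate_gbps: float, bits_per_symbol: float = 2.0) -> float:
    """Unit interval (one symbol period) in picoseconds."""
    baud_gbd = line_rate_gbps / bits_per_symbol
    return 1000.0 / baud_gbd  # 1 / GBd is in ns; multiply by 1000 for ps


def lane_power_w(line_rate_gbps: float, energy_pj_per_bit: float) -> float:
    """Power of one lane in watts. Gb/s times pJ/b gives mW, so divide by 1000."""
    return line_rate_gbps * energy_pj_per_bit / 1000.0


def total_serdes_power_w(lanes: int, line_rate_gbps: float,
                         energy_pj_per_bit: float) -> float:
    """Aggregate SerDes power for a device with the given number of lanes."""
    return lanes * lane_power_w(line_rate_gbps, energy_pj_per_bit)


if __name__ == "__main__":
    print(f"UI at 112G: {unit_interval_ps(112.0):.2f} ps")
    print(f"UI at 224G: {unit_interval_ps(224.0):.2f} ps")
    print(f"Per-lane power: {lane_power_w(224.0, 3.0):.3f} W")
    print(f"200 lanes: {total_serdes_power_w(200, 224.0, 3.0):.1f} W")
```

With these inputs the unit interval falls below 9 ps, and 200 lanes at 3 pJ/b draw roughly 134 W for the SerDes alone, which illustrates why energy per bit cannot be allowed to grow with data rate.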
Before hardware is available, system implementers rely on comprehensive simulation environments to predict and optimize performance. These simulations model the entire signal path, including:
The simulation environment enables designers to evaluate signal integrity, crosstalk, and the impact of simultaneous switching across many lanes. By constructing a virtual system—transmitter to receiver, including all intermediate interconnects—engineers can forecast whether the receiver will observe an open eye diagram and acceptable bit error rate (BER).
Fig. 3: Signal integrity (SI) analysis is required to ensure error-free 224Gbps signal transmission from TX to RX across the interconnects (package, PCB, connectors, backplane, and so on).
A primary metric is the pre-FEC BER, with specifications typically targeting better than 1E-4. However, robust system design requires margin across process, voltage, and temperature (PVT) variations. Simulations must also assess the effectiveness of forward error correction (FEC), comparing pre-FEC and post-FEC BER across all PVT corners to ensure reliable operation under worst-case conditions.
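The relationship between pre-FEC and post-FEC error rates can be approximated with a simple binomial model. The sketch below is illustrative only and rests on assumptions the article does not state: it assumes a Reed-Solomon RS(544,514) code over 10-bit symbols (commonly called KP4 in Ethernet contexts), which corrects up to 15 symbol errors per codeword, and it assumes bit errors are random and independent. Real links see burst errors from DFE or MLSD error propagation, so this estimate is optimistic. The function names are hypothetical.

```python
import math


def symbol_error_prob(ber: float, bits_per_symbol: int = 10) -> float:
    """Probability that a FEC symbol contains at least one bit error,
    assuming independent bit errors."""
    if not 0.0 <= ber < 1.0:
        raise ValueError("ber must be in [0, 1)")
    # 1 - (1 - ber)^m, computed stably for very small ber
    return -math.expm1(bits_per_symbol * math.log1p(-ber))


def codeword_error_rate(ber: float, n: int = 544, k: int = 514,
                        bits_per_symbol: int = 10) -> float:
    """Probability that a codeword has more symbol errors than the code
    can correct (t = (n - k) / 2), i.e. an uncorrectable codeword."""
    t = (n - k) // 2
    p = symbol_error_prob(ber, bits_per_symbol)
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for errs in range(t + 1, n + 1):
        # Binomial term evaluated in log space to avoid overflow/underflow
        log_term = (math.lgamma(n + 1) - math.lgamma(errs + 1)
                    - math.lgamma(n - errs + 1)
                    + errs * log_p + (n - errs) * log_q)
        total += math.exp(log_term)
    return min(total, 1.0)


if __name__ == "__main__":
    for ber in (1e-3, 5e-4, 1e-4, 1e-5):
        print(f"pre-FEC BER {ber:.0e} -> "
              f"uncorrectable codeword rate {codeword_error_rate(ber):.3e}")
```

Sweeping the pre-FEC BER this way shows how steeply the post-FEC error rate falls once the link clears the target, and how quickly margin erodes if a PVT corner pushes the raw BER upward.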
Once silicon is available, these models are validated against real hardware. For example, with 224G SerDes, system channels with 40–45 dB of loss are characterized using actual silicon, interconnects, and cables. Both near-end and far-end crosstalk are measured, and the results are compared to simulation predictions to close the loop on model accuracy.
Fig. 4: PHY SI models (TX/RX IBIS-AMI, die S-parameters) along with interconnect models, preferably in S-parameter formats, should be used in SI analysis.
SerDes performance cannot be evaluated in isolation. The full system—including interconnects, cables, connectors, packaging and PCB—must be considered.
In an HPC environment, it is not enough to design the SerDes to the IEEE or OIF specifications and test it against the electrical requirements for TX compliance or RX jitter tolerance (JTOL). SerDes providers need to work with ecosystem vendors to deliver a pre-tested, pre-verified solution that system integrators can deploy seamlessly when building rack connectivity. For instance, interoperability testing with high-density cable assemblies, OSFP pluggable modules, 1–2 m direct attach copper (DAC) cables, and near-chip NPC and CPC assemblies from various vendors, over channels with up to 45–50 dB of loss, lays the groundwork for validating real-world scenarios within the rack and along the rack-to-rack connectivity path. These interop tests confirm that the SerDes and the end-to-end channel (interconnects, package, and PCB) work together to deliver the required performance in an HPC system.
Fig. 5: C2M VSR & LR Rack-to-Rack connectivity with 224G SerDes.
As the industry looks beyond 224G, the transition to 448G SerDes is already underway. This next leap will require not only doubling the data rate but also rethinking modulation schemes and channel definitions to address the unique challenges at these frequencies.
These channel behaviors will determine the PHY modulation and complexity, and vice versa.
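The trade-off between modulation order and channel bandwidth can be sketched numerically. The example below is a simplified illustration: the candidate modulations (PAM4, PAM6, PAM8) are assumptions chosen for comparison rather than schemes named in this article, bits per symbol are taken as the idealized log2 of the number of levels (practical mappings for non-power-of-two schemes carry somewhat fewer), and coding overhead is ignored.

```python
import math


def baud_rate_gbd(line_rate_gbps: float, pam_levels: int) -> float:
    """Symbol rate in GBd, using the idealized log2(levels) bits per symbol."""
    if pam_levels < 2:
        raise ValueError("pam_levels must be at least 2")
    return line_rate_gbps / math.log2(pam_levels)


def nyquist_ghz(line_rate_gbps: float, pam_levels: int) -> float:
    """Nyquist frequency in GHz: half the symbol rate."""
    return baud_rate_gbd(line_rate_gbps, pam_levels) / 2.0


if __name__ == "__main__":
    for levels in (4, 6, 8):
        print(f"448G PAM{levels}: "
              f"{baud_rate_gbd(448.0, levels):.1f} GBd, "
              f"Nyquist {nyquist_ghz(448.0, levels):.1f} GHz")
```

Higher-order modulation lowers the Nyquist frequency and therefore the channel loss the link must tolerate, but it narrows the spacing between signal levels and demands a better signal-to-noise ratio, which is the coupling between channel behavior and PHY complexity described above.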
Fig. 6: 448G system with the interconnect topology for interoperability.
The move to 448G will intensify demands on every aspect of the system, from advanced materials and connector technologies to even more sophisticated DSP and equalization techniques. Power delivery, thermal management, and system integration will all require further innovation to maintain signal integrity and energy efficiency at these unprecedented speeds.
Achieving 1.6 Tbps port-level interoperability in next-generation HPC and AI/ML SoCs is a multidisciplinary challenge that extends well beyond SerDes design. Success requires:
With 224G SerDes now in production and 448G development underway, the industry is positioned to deliver the bandwidth, latency, and scalability needed for the next generation of AI and HPC. Continued advancement will depend on holistic system engineering, robust standards, and a focus on real-world interoperability.
The Synopsys 224G PHY IP addresses these demands with state-of-the-art design, simulation, and measurement methodologies, delivering signal integrity and jitter performance that exceeds IEEE 802.3 and OIF specifications and supports UALink 200G deployments.
Diwakar Kumaraswamy is a senior staff technical product manager at Synopsys.