There is an unavoidable latency due to error correction and the physical cable length.
Since its debut in the 1980s as a 10Mbps shared LAN over coaxial cable, Ethernet has advanced steadily and now has the potential to support speeds up to 1.6Tbps. This progression has allowed Ethernet to serve a wider range of applications, such as live streaming, Radio Access Networks, and industrial control, all of which depend on reliable packet transfer and quality of service. With aggregate Internet bandwidth currently at ~500 Tbps, there is growing demand for better handling of back-end intra-datacenter traffic. Although individual servers do not yet operate at terabit-per-second rates, overall datacenter traffic is nearing this scale, which has prompted the IEEE’s 802.3dj group to begin standardization and calls for robust Ethernet controllers and SerDes to manage the expanding data flow. Against this backdrop, interprocessor communication is already pushing toward these speeds.
Interprocessor communication is spearheading the need for 1.6T rates with minimal latency. While individual devices are limited by their processing capacity and chip size, combining chips can significantly extend that capacity. This first generation of applications is expected to be followed by intra-datacenter switch-to-switch connections, which enable the pooling of high-performance processors and memory and boost scalability and efficiency within cloud computing.
Compliance with the evolving standard is pivotal for seamless ecosystem interoperability. The IEEE’s 802.3dj group is formulating the upcoming Ethernet standard, which covers physical layers and management parameters for speeds from 200G up to 1.6 Terabits per second. The group’s objective is a 1.6 Tbps Ethernet MAC data rate with a bit error rate of no more than 10⁻¹³ at the MAC layer. Further provisions include optional 16-lane and 8-lane Attachment Unit Interfaces (AUI) suitable for different chip applications, leveraging 112G and 224G SerDes. Physically, the 1.6Tbps specification entails transmission over 8 pairs of copper twinax cables for up to one meter and over 8 pairs of fiber for distances between 500 meters and 2 km. Although complete ratification of the standard is anticipated by spring 2026, the core set of features is projected to be complete in 2024.
Fig. 1: Diagram depicting the components of a 1.6T Ethernet Subsystem.
In earlier Ethernet iterations, the PCS primarily focused on data encoding for reliable packet detection. At 1.6T Ethernet speeds, however, Forward Error Correction (FEC) becomes necessary to counteract signal degradation, even over short links. For this purpose, 1.6T Ethernet continues to use Reed-Solomon FEC. This approach builds a codeword from 514 10-bit symbols encoded into a 544-symbol block, resulting in a bandwidth overhead of roughly 6%. These FEC codewords are distributed across the AUI physical links (8 for 1.6T Ethernet) so that no single physical link carries an entire codeword. This method not only gives additional protection against error bursts but also enables parallelization in the far-end decoder, thereby reducing latency.
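The Reed-Solomon figures above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `rs_fec_summary` is hypothetical, and the correction capability it reports is the general Reed-Solomon bound of (n - k) / 2 symbol errors per codeword, not a number quoted in this article.

```python
def rs_fec_summary(n_symbols=544, k_symbols=514, bits_per_symbol=10):
    """Summarize an RS(n, k) code (hypothetical helper for illustration)."""
    parity = n_symbols - k_symbols
    return {
        "parity_symbols": parity,
        # Overhead relative to the payload symbols in each codeword.
        "overhead": parity / k_symbols,
        # General Reed-Solomon bound: (n - k) // 2 correctable symbol errors.
        "correctable_symbols": parity // 2,
        # Size of one encoded codeword on the wire.
        "codeword_bits": n_symbols * bits_per_symbol,
    }


summary = rs_fec_summary()
print(f"parity symbols per codeword: {summary['parity_symbols']}")
print(f"bandwidth overhead: {summary['overhead']:.1%}")
print(f"correctable symbols per codeword: {summary['correctable_symbols']}")
print(f"codeword size: {summary['codeword_bits']} bits")
```

The overhead works out to 30 parity symbols on 514 payload symbols, which is where the "roughly 6%" figure comes from.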
The Physical Medium Attachment (PMA), featuring a gearbox and SerDes, brings the Ethernet signal onto the transmitted channels. For 1.6T Ethernet, this involves 8 channels each running at 212Gbps, a rate that includes the 6% FEC overhead. The modulation technique is 4-Level Pulse Amplitude Modulation (PAM-4), which encodes two data bits in each transmitted symbol and so doubles the data rate at a given symbol rate compared with the traditional Non-Return-to-Zero (NRZ) approach. The transmitter relies on digital-to-analog conversion, while the receiver combines analog-to-digital conversion with DSPs to ensure accurate signal extraction.
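As a rough check on the per-lane figure, the lane rate can be derived from the MAC rate. This is a sketch under stated assumptions: the 257/256 transcoding factor is not mentioned in this article and is assumed here from the 256b/257b transcoding used by earlier 200G/400G Ethernet PCS definitions, and the function name `lane_rate_gbps` is hypothetical.

```python
from fractions import Fraction


def lane_rate_gbps(mac_rate_gbps=1600, lanes=8,
                   transcode=Fraction(257, 256),  # assumed 256b/257b transcoding
                   fec=Fraction(544, 514)):       # RS(544,514) expansion
    """Per-lane line rate in Gbps, kept exact with Fraction (illustrative)."""
    return Fraction(mac_rate_gbps, lanes) * transcode * fec


rate = lane_rate_gbps()
baud = rate / 2  # PAM-4 carries 2 bits per symbol

print(f"per-lane line rate: {float(rate)} Gbps")
print(f"per-lane symbol rate: {float(baud)} GBd")
```

Under these assumptions the result is 212.5 Gbps per lane, consistent with the rounded 212Gbps quoted above, and the PAM-4 symbol rate is half of that.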
Note that the FEC in the Ethernet PCS acts as an “outer FEC” that spans the Ethernet link end to end. To support longer-reach channels, an additional layer of error correction for individual physical lines is being defined, likely using a Hamming code FEC. This inner correction is expected to be used mainly in optical transceiver modules, where it is essential.
Fig. 2: Diagram showcasing additional overhead added when using a concatenated FEC for extended reach.
In the example system depicted in figure 2, the MAC and PCS are connected via an optical module and a stretch of fiber. The link between the PCS and the optical module has a bit error rate of 10⁻⁵, and the optical link itself adds further errors. A single end-to-end RS-FEC would not be enough to achieve the 10⁻¹³ bit error rate that the Ethernet standard requires, so the link would be unreliable. One option would be to implement a separate RS-FEC on each of the three hops, but this would increase cost and latency significantly. A more effective solution is to add a concatenated Hamming code FEC specifically for the optical link, which suits the random errors typical of optical connections. This inner FEC layer expands the line rate further, from 212 Gbps to 226 Gbps, so it is essential that the SerDes can support this line rate.
Fig. 3: Latency path for 1.6T Ethernet Subsystem.
Several components contribute to Ethernet latency: the transmit queue, transmission duration, medium traversal time, and several processing and receive times. To visualize this, consider figure 3, which shows a complete 1.6T Ethernet subsystem. Latency can also be influenced by the reaction time of the far-end application, but this factor is external to Ethernet and is therefore often excluded from latency analysis. Minimizing latency at the Ethernet interface requires understanding the specific circumstances. For example, latency may not be a primary concern for trunk connections between switches, because slower client links already introduce delays. Distance also plays a role; greater lengths introduce more latency. This doesn’t mean that latency should be overlooked in other scenarios: reducing latency is always an objective.
Transmission latency is inherently tied to the Ethernet rate and the frame size. For a 1.6T Ethernet system, transmitting a minimum-sized packet takes about 0.4ns, which is essentially one Ethernet frame per tick of a 2.5 GHz clock. Transmitting a standard maximum-sized frame takes about 8ns, extending to 48ns for Jumbo Frames. The chosen medium also affects latency. For instance, optical fiber typically incurs a latency of 5ns per meter, while copper cabling is marginally faster at 4ns per meter.
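These transmission and propagation figures follow from simple arithmetic, sketched below. The frame sizes are assumptions, since the article does not state them: 64 and 1518 bytes are the usual minimum and maximum standard frame sizes, and 9600 bytes is one common jumbo frame size. The function names are hypothetical.

```python
# Propagation delays per meter as quoted in the text above.
PROPAGATION_NS_PER_M = {"fiber": 5.0, "copper": 4.0}


def serialization_ns(frame_bytes, rate_gbps=1600.0):
    """Time to put a frame on the wire; 1 Gbps equals 1 bit per ns."""
    return frame_bytes * 8 / rate_gbps


def propagation_ns(length_m, medium):
    """Time for the signal to traverse a cable of the given length."""
    return length_m * PROPAGATION_NS_PER_M[medium]


# Assumed frame sizes in bytes (not stated in the article).
for name, size in (("minimum", 64), ("maximum standard", 1518), ("jumbo", 9600)):
    print(f"{name} frame ({size} B): {serialization_ns(size):.2f} ns")

print(f"1 m copper: {propagation_ns(1, 'copper'):.0f} ns")
print(f"2 km fiber: {propagation_ns(2000, 'fiber'):.0f} ns")
```

A bare 64-byte frame serializes in 0.32ns; adding the 8-byte preamble and 12-byte inter-packet gap gives 0.42ns, in line with the roughly 0.4ns quoted above. Over a 2 km fiber, propagation delay is far larger than the serialization time of even a jumbo frame.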
A substantial segment of the overall latency is rooted in the receiver controller. The RS FEC decoder inherently introduces latency. To initiate error correction, the system must receive 4 codewords, which, at 1.6Tbps, amounts to 12.8ns. Subsequent activities, including error correction and buffering, amplify this latency. While the FEC codeword storage duration remains consistent, the latency during message reception is contingent upon the specific implementation. Nevertheless, latency can be optimized by employing meticulous digital design strategies.
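The 12.8ns figure can be reproduced from the codeword size and the aggregate line rate. This is a minimal sketch, assuming the accumulation time is measured at the encoded line rate of 8 lanes at 212.5 Gbps each (the unrounded form of the 212Gbps lane rate); the function name `fec_accumulation_ns` is hypothetical.

```python
def fec_accumulation_ns(codewords=4, symbols_per_codeword=544,
                        bits_per_symbol=10, lanes=8, lane_rate_gbps=212.5):
    """Time to receive the given number of RS(544,514) codewords (illustrative)."""
    total_bits = codewords * symbols_per_codeword * bits_per_symbol
    aggregate_gbps = lanes * lane_rate_gbps  # assumed encoded line rate
    return total_bits / aggregate_gbps       # bits / (bits per ns) = ns


print(f"FEC accumulation time: {fec_accumulation_ns():.1f} ns")
```

This is only the time spent waiting for the codewords to arrive. The error correction and buffering described above add to it, and that part depends on the implementation.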
In essence, there is an inherent, unavoidable latency due to the FEC mechanism and the physical distance or cable length. Beyond these factors, design expertise plays a pivotal role in minimizing Ethernet controller latency. A complete solution that integrates and optimizes the MAC, PCS, and PHY paves the way for the most efficient, lowest-latency implementation.
Fig. 4: First-pass silicon success for Synopsys 224G Ethernet PHY IP in 3nm process showcasing highly linear PAM-4 eyes.
1.6 Tbps Ethernet is tailored for the most bandwidth-demanding and latency-sensitive applications. With the emergence of 224G SerDes technology, together with advances in MAC and PCS IP, complete solutions are now available that track the evolving 1.6T Ethernet standard. Because some latency is intrinsic to the protocol and its error correction methods, the digital and analog design of the IP must be carefully crafted by expert designers to avoid introducing unnecessary latency into the datapath.
Achieving top performance for 1.6T SoC designs requires an efficiently optimized architecture and meticulous design practices for every chip component, with an emphasis on conserving power and minimizing silicon footprint to make 1.6T data rates a reality. Silicon-proven Synopsys 224G Ethernet PHY IP has set the stage for the 1.6T MAC and PCS Controller. Using leading-edge design, analysis, simulation, and measurement techniques, Synopsys continues to deliver exceptional signal integrity and jitter performance, with a complete Ethernet solution including MAC+PCS+PHY.