AI Drives Need For Optical Interconnects In Data Centers

Old protocols are evolving as new ideas emerge and volume of data increases.


An explosion of data, driven by more sensors everywhere and the inclusion of AI/ML in just about everything, is ratcheting up the pressure on data centers to leverage optical interconnects to speed up data throughput and reduce latency.

Optical communication has been in use for several decades, starting with long-haul communications, and evolving from there to connect external storage to server racks, and then to servers within those racks. More recently, it is being eyed for communication between different modules and advanced packages, and ultimately it could end up as a way of speeding up data movement inside of advanced packages with minimal heat and power.

The challenge has been integrating photons with electrons. Interconnects at that level historically have been clunky and somewhat unreliable, but they are improving rapidly as the industry recognizes the need for optical technology. This is already happening in hyperscaler data centers, which are more uniform in design than enterprise data centers and have no single point of failure. “In an enterprise data center, you see daily changes,” said Mark Seymour, distinguished engineer at Cadence. “You don’t see that in a hyperscaler. Also, in a hyperscaler, if you have a failure, it’s not significant because they just transfer the workload.”

That helps explain why the hyperscalers consistently have been at the leading edge of optical technology. But with more AI/ML, and more data to process everywhere, the need for lower latency, lower power, and lower heat inside all types of data centers is becoming pervasive.

“Inside the data centers, the optical connectivity can be divided into two types,” said Jigesh Patel, technical marketing manager at Synopsys. “One is inter-rack, and the other is backplane or intra-rack. Typically, multi-mode fiber is used for both. Silicon photonics is rapidly replacing traditional copper interconnects used to provide on-board and on-chip connectivity because of better thermal, spectral and energy efficiencies of the photonics. The industry is gradually moving away from the system-on-a-chip design approach to a system-of-chips in a single package.”

Alongside of that shift, the industry’s switch from copper to fiber is just basic math. “A copper cable running at 200 gigabits per second is going to have a loss,” said John Calvin, senior strategic planner for IP wireline solutions at Keysight. “For example, a 1 meter cable is going to have about 34 dB of loss. That’s 50:1 signal loss. You start out with a 50 millivolt signal on the transmitter. You’re now down to 1 millivolt at the receiver. That’s the problem. That loss is basically electrical signal that’s been burned up into heat, and that’s what’s killing the data centers. [By contrast], you can send an optical signal tens of kilometers on a fiber and it’s down 1 dB.”
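Calvin's arithmetic can be sanity-checked with the standard decibel-to-voltage conversion, ratio = 10^(dB/20). The short sketch below hard-codes the figures from the quote and is purely illustrative:

```python
import math

def db_to_voltage_ratio(db_loss: float) -> float:
    """Convert a loss in dB to the equivalent voltage ratio (20*log10 convention)."""
    return 10 ** (db_loss / 20)

loss_db = 34.0                       # ~1 meter of copper at 200 Gb/s, per the quote
ratio = db_to_voltage_ratio(loss_db) # roughly 50:1
tx_mv = 50.0                         # 50 mV launched at the transmitter
rx_mv = tx_mv / ratio                # amplitude left at the receiver, ~1 mV
print(f"{loss_db} dB -> {ratio:.0f}:1 ratio, {tx_mv} mV in -> {rx_mv:.2f} mV out")
```

Running the numbers confirms the quote: 34 dB is a roughly 50:1 voltage ratio, so a 50 mV transmit signal arrives at about 1 mV.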

Breakdown of interconnect choices

Copper options:

  • Direct attached copper (DAC) has a high-density connector on both ends, with a copper differential transmission line connecting the two connectors together. At the high speeds found in data centers, DACs are good for approximately two meters of interconnect.
  • Active copper cable (ACC) adds signal-conditioning electronics to extend reach, covering mid-range lengths between passive copper and optical more cost-effectively than optical.

Optical options:

  • Single-mode fiber, with a core diameter of 8 to 10 µm
  • Multi-mode fiber, with a core diameter of 50 to 100-plus µm

While “multi-mode” might sound more complex, it is actually single-mode fiber that is trickier and costlier to work with, even setting aside the price difference in the fiber itself.

“Lasers and other components used in single-mode fiber-based systems are more expensive,” Patel noted. “Since the fiber’s core diameter is much smaller, coupling tolerances between transmitter and fiber, and between fiber and photodetector, are tighter compared to the tolerances in multi-mode fiber-based systems. On the other hand, single-mode fiber offers higher bandwidth compared to multi-mode fiber, which means a larger amount of data can be carried longer distances in single-mode fiber-based transmission.”

Multi-mode fiber has remained popular for the most pragmatic of reasons.

“A multi-mode fiber you can pull off of a spool and cleave it, stick it into a connector, and you’re good to go,” Keysight’s Calvin explained. “Single-mode fiber takes precision alignment and optics, and requires a very precise cut. Typically, you buy single-mode optics from the vendor with a particular application in mind. There’s no spool of single-mode fiber in your data center where you pull off the amount you need for this interconnect. That’s what drives people crazy. They want flexibility, and multi-mode fiber, while it’s less efficient and not capable of going as far as single-mode fiber, is a data center operator’s best friend because it’s nice, easily scalable, and flexible to use. We keep wondering when multi-mode is going to die, and it never does because of its flexibility.”

There are other modifications to these basics, including single-mode fiber with Dense Wavelength Division Multiplexing (DWDM), which splits the spectrum into many parallel channels to provide more bandwidth and carry signals across longer distances, and coherent optics, which mixes and amplifies signals.
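To get a feel for how splitting the spectrum multiplies capacity: DWDM carves a fiber band into fixed-width channels on a grid. The sketch below assumes a usable C-band width of roughly 4.4 THz (about 1530 to 1565 nm) and the common 100/50/25 GHz grid spacings; the exact channel counts depend on the grid and guard bands actually deployed.

```python
# Approximate usable width of the optical C-band, in GHz (~4.4 THz).
C_BAND_GHZ = 4_400

# Common fixed-grid channel spacings, in GHz.
for spacing_ghz in (100, 50, 25):
    channels = C_BAND_GHZ // spacing_ghz
    print(f"{spacing_ghz} GHz grid -> ~{channels} channels on one fiber")
```

Each halving of the grid spacing roughly doubles the number of independent wavelengths, and therefore the aggregate bandwidth, carried on a single strand.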

“If you implement 802.3ct or 802.3cw, it’s a coherent optical link, which is the most efficient use of the optical spectrum and can connect mirror (backup) data centers 40 km or more away,” Calvin noted.

The next link in the communications chain presents its own set of challenges.

“In order to get data in and out, there’s a small portion of that connectivity, which is just straight copper cables, but that has a pretty short reach associated with it — no more than a few meters,” explained Manish Mehta, vice president of marketing and operations for optical systems at Broadcom. “If you want to go any more than a few meters from the switch to another appliance, you have to transmit optically. The way you do that today is with pluggable optical transceivers. As a reference point, each of these transceivers is 400 gigabits per second of bandwidth, and one of these switches can have up to 32 plugged in. That’s a 12.8-terabit switch. One of the core components of the transceiver is a semiconductor laser, and then there are some ICs that drive that laser. But there are a lot of small mechanical components needed to hold everything in place. For example, getting fiber from the laser to the front of the module requires strain relief so you can operate under tough environmental conditions.”

This leads to a scaling problem, which is driving some of the innovation in the industry, such as co-packaged optics.

“The 10 million or so of these that are purchased every year by the hyperscalers are almost all manually assembled at factories throughout Asia,” Mehta said. “The hyperscalers see that as unscalable, especially over time, as copper becomes less and less able to transmit data at the reaches necessary in a data center. Every time you go through a speed generation, you go from a 100Gb SerDes to a 200Gb SerDes and beyond, you’re reducing the reach of copper just by the laws of physics. And there’s more optical connectivity that’s required in a data center. So that’s the problem that has to be solved. The amount that’s being spent on these optical transceivers now dwarfs the ASICs, and they are not the most reliable of devices at an industry-wide level. The hyperscale paradigm for dealing with this is if one doesn’t work, they pull it out and plug in another one. There is an absolute need for the siliconization of optics.”

Fig. 1A: The present and future of optical interconnects. Source: Synopsys

Fig. 1B: Broadcom’s design for co-packaged optics. Source: Broadcom

Rethinking interconnects
Broadcom’s solution is to consolidate eight 800 Gb transceivers into a single 6.4T optical engine (OE) and integrate these engines on a common substrate with the switch ASIC, providing all of the optical connectivity the system needs. A 51.2T switch would then require eight of these 6.4T optical engines.
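The port arithmetic behind both the pluggable and co-packaged figures above is straightforward, and worth checking. An illustrative sketch:

```python
# Pluggable-transceiver switch from the Mehta quote:
TRANSCEIVER_GBPS = 400            # bandwidth per pluggable transceiver
PORTS = 32                        # pluggables per switch
switch_tbps = TRANSCEIVER_GBPS * PORTS / 1000
print(f"Pluggable switch: {switch_tbps} Tb/s")          # 12.8 Tb/s

# Co-packaged optical engines for the next-generation switch:
ENGINE_TBPS = 6.4                 # one OE = eight 800 Gb transceivers consolidated
TARGET_TBPS = 51.2
engines_needed = TARGET_TBPS / ENGINE_TBPS
print(f"Engines for a {TARGET_TBPS}T switch: {engines_needed:.0f}")  # 8
```

Both figures line up: 32 × 400 Gb/s is 12.8 Tb/s, and eight 6.4T engines cover a 51.2T switch.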

Several start-ups are also tackling optical interconnect problems. Lightmatter, for example, offers a communications layer placed between a substrate and an ASIC. That provides more layout options because it addresses problems such as the attenuation of electrical signals.

Richard Ho, vice president of hardware engineering at Lightmatter, explained that this communications layer functions like an optical circuit switch (OCS), but on silicon. “Basically, when we configure it, we have a direct point-to-point connection from any one of our nodes within our communications layer to all the other nodes in that layer, and you can reconfigure it dynamically. You can set it so they go all-to-all communication, or you can set it so it’s a ring, or set it into a 3D hypertoroid. There are all these types of configurations because it is in the silicon, and we have a way to control it and we can control where the light goes. But once we set it a certain way, it becomes like a wire. It just goes straight through to the other location — really fast. And so you’re basically able to reconfigure the wires inside your computer package in a very short amount of time. It’s unique technology. You can’t do this electrically. You can only do this with silicon photonics.”
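A toy model makes the idea of "reconfiguring the wires" concrete. This is not Lightmatter's API, just an illustration: each topology is nothing more than a different table of point-to-point links over the same fixed set of nodes, and switching topologies means swapping the table.

```python
from itertools import combinations

def ring(nodes):
    """Ring topology: each node links to its successor (n links for n nodes)."""
    n = len(nodes)
    return {(nodes[i], nodes[(i + 1) % n]) for i in range(n)}

def all_to_all(nodes):
    """All-to-all topology: every node pair gets a direct link (n*(n-1)/2 links)."""
    return set(combinations(nodes, 2))

nodes = list(range(8))
# "Reconfiguration" is just choosing which link table is active.
print(f"ring: {len(ring(nodes))} links, all-to-all: {len(all_to_all(nodes))} links")
```

The point of doing this in silicon photonics rather than electrically is that once a table is programmed, each entry behaves like a dedicated wire carrying light straight through, rather than a packet-switched hop.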

Fig. 2: Passage communications layer. Source: Lightmatter

Celestial AI has an optical interconnect solution as well, which the company claims is more thermally stable than the ring resonators currently used to transport optical signals, and which should allow closer communication with ASICs. “We’re providing optical connectivity all the way to the point of compute,” explained Celestial AI CEO David Lazovsky. “A lot of the companies in the co-packaged optics space work on a wire-forwarding technology, meaning you send me a signal electronically, and all I’m going to do is convert that to optics and drop it off as an electrical signal at the other end of the fiber. At Celestial AI, we are a full-stack solution. We have a protocol-adaptive layer to provide compatibility with our customers’ existing infrastructures.”

Fig. 3: Photonic fabric configurations. Source: Celestial AI

Direct optical wiring
While most of the innovative work in photonics has centered on engineering components such as optical transceivers or refining laser outputs, Korean start-up Lessengers has developed a new type of fiber as part of an approach it dubs “direct optical wiring” (DOW). The material is currently sold as part of the company’s solutions for HPCs, but is being considered for possible licensing in the future.

“This material is room temperature operation capable, which means most of it is liquid at room temperature and solidified during the wiring process,” said Taeyong Kim, Lessengers’ chief marketing officer. “The shape of the wire can be easily controlled into a wider shape, or different types of shapes, by changing a set of parameters such as the mechanical dimension of the wiring tip, wiring speed, ratio of mixture, etc. The user optimizes the wiring recipe and the in-house DOW machine does the rest.”

Fig. 4: Direct optical wiring. Source: Lessengers

Ethernet and PCIe
Whether copper or optical, the PHY uses a protocol that just celebrated its 50th anniversary — IEEE 802.3 Ethernet. “Ethernet has grown beyond everyone’s expectations and is really the fabric that’s used in data centers to connect everything,” said Calvin. “Ethernet packets are great because they scale exceptionally well.”

Even its proponents admit that it can creak with age, but the industry is working on fixes. “It’s true that Ethernet brings a bit of baggage with it. It’s a heavily loaded protocol and it’s evolved over 50 years, so it’s not nearly as lean and mean as some of the more contemporary protocols,” said Calvin. “The Ultra Ethernet consortium is going to tweak Ethernet to make it go faster. Still, at the end of the day, you don’t want to stray too far away from Ethernet because when the protocol of the signal goes in and out of a data center, it’s going to be Ethernet.”

PCIe is the current protocol of choice for copper interconnects. However, as the standard evolves, it’s possible copper will lose favor.

“While copper is mostly used for PCIe versions up to 6, limitations in terms of bandwidth, latency, and energy consumption are already obvious, fueling immense interest in the linear, direct-drive optical engine,” said Patel. “The benefits gained from an optical engine include higher bandwidth, lower energy consumption because there is no need for re-timers, low latency, and lower cost due to elimination of digital signal processing (DSP) in pluggable modules.”

There are other options, as well. “You have a choice whether you transmit as a serial link or as a parallel link, and that’s where SerDes comes in,” Patel said. “It can be either fully electrical SerDes, or it can be linear-drive SerDes, where there is also an optical engine within the SerDes hardware. That is where optical interconnect comes in. Copper can support PCIe 4, 5, or even 6, but at 6 the limitations of copper are already visible.”

That limitation shows up in energy consumption, which increases data center cooling requirements. “The next version of PCIe, PCIe 7, is due for release in 2025. The supported data rate per lane will be 32 GB/s,” he said. “Many in the industry believe that in order to support such a high data rate, the use of optics is inevitable. In fact, a recently formed workgroup at PCI-SIG is looking into delivering PCIe technology over optical connections.”
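The pressure on copper comes from the doubling cadence of PCIe signaling rates. The sketch below uses the commonly cited raw per-lane rates per generation and ignores encoding and FEC overhead, so the bandwidth figures are rough; note that per-lane figures like the 32 GB/s in the quote typically count both directions of the link.

```python
# Commonly cited raw per-lane signaling rates by PCIe generation (GT/s).
# Each generation doubles the rate, which halves practical copper reach.
rates_gt = {4: 16, 5: 32, 6: 64, 7: 128}

for gen, gt in rates_gt.items():
    # Rough unidirectional bandwidth on an x16 link, ignoring overhead:
    x16_gbs = gt * 16 / 8   # GT/s ~ Gb/s raw per lane; 16 lanes; 8 bits per byte
    print(f"PCIe {gen}: {gt} GT/s/lane, ~{x16_gbs:.0f} GB/s per direction on x16")
```

Each doubling forces either shorter copper runs, heavier re-timing and DSP, or a move to optics, which is exactly the trade the PCI-SIG workgroup is examining.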

Richard Solomon, senior staff technical product manager at Synopsys, offered another glimpse into the future: “CXL uses the PCIe transport. Once optical PCIe becomes more common, CXL will build right on top of it. Some of the drivers causing the PCI-SIG to want to go after optical are usage models where I’m going to transport CXL from box to box around my data center. Nobody thinks optical PCIe and CXL are going to be anything more than 10-meter reach. But it would still be a huge advantage in a data center if I could go top-to-bottom of rack, or rack-to-rack. Imagine all those things connected with closer-to-DRAM latencies. Wow.”

New ideas emerging
Despite the reliance on older protocols and the persistence of multi-mode fiber, the industry is moving forward.

In August, OIF announced the External Laser Small Form-Factor Pluggable (ELSFP) Implementation Agreement (IA), which defines a front-panel pluggable form factor tailored to co-packaged optical systems and other applications with multiple external laser sources.

According to the OIF, the IA includes definitions for placing laser sources at the front panel, the coolest section of the system, which enhances system reliability and allows for efficient “hot-swap” field replacement when necessary.

“These are used for the emerging high-density optics that are commonly found in applications such as artificial intelligence and machine learning,” said Calvin. “Basically, these are the emerging interconnects that are used inside hyperscale data centers that are moving the needle in this industry right now with extremely low-latency, high-speed, high-performance requirements.”

With evolving standards and physics, and a push from the AI/ML market, optical interconnects will continue to advance. Analyst firm LightCounting predicts the optical components market will hit $20 billion by 2027, with growth in all sectors — including that old standby, Ethernet.

1. Ultra Ethernet Consortium.
2. LightCounting. “Optical components market to hit $20 billion by 2027, despite weak 2023.” LightTrends Newsletter, October 2022.

Related Reading
New Standards Push Co-Packaged Optics
Speed, density, distance, and heat all need to be considered; pluggables still have a future.
Processor Tradeoffs For AI Workloads
Gaps are widening between technology advances and demands, and closing them is becoming more difficult.
