Linear Pluggable Optics Save Energy In Data Centers

New OIF electrical standards will enable interoperability, adding another option for faster and more efficient data movement.

Linear pluggable optics (LPO) is garnering more attention as a way to quickly and efficiently move data in and out of server racks, but a lack of standards for connecting the optical modules is slowing adoption at a time when there is growing pressure to reduce power in data centers.

LPO is the newer of two approaches to solving the power wall problem in data centers. Co-packaged optics (CPO), a technology whose development precedes LPO, moves optics out of an octal small form-factor pluggable (OSFP) module and into the electrical-component package, eliminating the pluggable module. LPO, in contrast, moves only the DSP functionality out of the pluggable module and into the top-of-rack (ToR) switch so that the switch’s electrical signals can drive the module directly.

Although LPO saves less power than CPO, it offers better protection than CPO from thermal effects, which can cause optical signals to drift.

“You have to control the temperature of the device in the rack,” said Dick Otte, CEO of Promex. “We’ve seen cases where people are trying to control the temperature of optical devices to as fine as one-tenth of a degree centigrade.”

The challenge with LPO is ensuring that modules from different vendors are interoperable, but that is about to change. The Optical Internetworking Forum (OIF) is preparing electrical standards to boost interoperability.

Fiber moves further into the data center
Although fiber has long been employed for intercontinental and other long-haul communications, its utility is growing for shorter links. It has been used for some time to interconnect different buildings within a data center campus, and now it’s being deployed to interconnect server racks.

“Optical communication/interconnect has continually scaled down in distance while moving closer to the ASIC [that’s sending and receiving signals]. We have seen the transition from long-haul to metro to local-area networks and then into the data center,” said Suresh Jayaraman, senior director of package development at Amkor Technology. “More recently we have seen a flurry of activity in HPC.”

Copper still dominates within the rack, but fiber connects the ToR switches. The leaf-and-spine network architecture is popular in the data center, with the ToR switch acting as the leaf switch and using fiber to connect to other racks through the spine switches.

Fig. 1: Leaf-spine network architecture in a data center. Servers within a rack connect to a top-of-rack (ToR) switch that communicates with other racks over fiber through spine switches. Source: Bryon Moyer/Semiconductor Engineering

“Whenever you have to go from one rack to another rack, you hop through a switch,” said Priyank Shukla, director of product management for interface IP at Synopsys. “This is typically where an optical interface is used.”
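The rack-to-rack hop described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of any real fabric: the switch names, the counts of leaves and spines, and the assumption that every leaf (ToR) switch links to every spine are all hypothetical choices made for the example.

```python
from itertools import product

def build_leaf_spine(num_leaves: int, num_spines: int) -> dict[str, set[str]]:
    """Return an adjacency map in which every leaf (ToR) switch links to every spine.

    The full leaf-to-spine mesh is an assumption made for this sketch.
    """
    links: dict[str, set[str]] = {}
    for l, s in product(range(num_leaves), range(num_spines)):
        leaf, spine = f"leaf{l}", f"spine{s}"
        links.setdefault(leaf, set()).add(spine)
        links.setdefault(spine, set()).add(leaf)
    return links

def rack_to_rack_paths(links: dict[str, set[str]], src_leaf: str, dst_leaf: str) -> list[list[str]]:
    """List every leaf -> spine -> leaf path between two racks."""
    if src_leaf == dst_leaf:
        return [[src_leaf]]  # traffic never leaves the rack's own ToR switch
    shared_spines = sorted(links[src_leaf] & links[dst_leaf])
    return [[src_leaf, spine, dst_leaf] for spine in shared_spines]

fabric = build_leaf_spine(num_leaves=4, num_spines=2)
for path in rack_to_rack_paths(fabric, "leaf0", "leaf3"):
    print(" -> ".join(path))
```

Each printed path crosses two leaf-to-spine links, and those are the fiber links where pluggable optical modules sit in the architecture the article describes.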

Fiber predominantly enters the switches through OSFP modules. “The fiber goes to the top of the racks,” said Otte. “There is relatively little optics within a rack today, and that is sort of one of the next big steps that we’re going to see.”

Those OSFP modules plug into the switch at the front plate. But the signals driving the fiber originate further inside the switch, so an electrical path is necessary between that source and the plug at the edge. That path is typically an electrical SerDes link, and it contributes significantly to energy consumption.

The electrical signals that the SerDes carries drive a DSP chip in the OSFP module, and that chip cleans up the signal, including retiming it, performing equalization, and implementing forward error correction (FEC) and other functions intended to improve reception at the far end of the fiber.

“Electro-optical interfaces today are based on linear retimed interfaces because the signal you are receiving here gets conditioned in such a way that you recover a clock,” said Shukla. “An optical module generally has one CMOS chip with a physical layer and the digital signal processing.”

That DSP chip then converts the digital signals to analog to drive the optical engine, which modulates a light signal and transmits it. In the receive direction, the DSP retimes and cleans up the incoming signal before sending it to the switch over the SerDes link.

Fig. 2: Simplified traditional optical switch connection. Signals remain in the digital domain as processed in the switch ASIC and sent across the SerDes to the edge of the switch where an optical module is plugged. Source: Bryon Moyer/Semiconductor Engineering

One DSP will do
Although this arrangement works, the DSP consumes a significant amount of energy. It accounts for roughly 50% of the power drawn by the entire pluggable module. That makes it a target for savings.

Meanwhile, the switch ASIC has its own DSP for sending signals onto and receiving them from the SerDes. “As the technologies have evolved, the switch SerDes has so much DSP that the functions you have [in the pluggable module] can be implemented by the switch itself,” said Shukla.

In other words, the switch’s DSP has become powerful enough to handle both its tasks and the ones being performed in the pluggable module. The promise of LPO is that the DSP in the module is eliminated. What remains is some basic equalization and a transimpedance amplifier (TIA) — and a significant power reduction. “Your switch is directly driving the optics,” said Shukla.

Fig. 3: LPO pluggable module. The DSP functionality once in the pluggable module is now in the switch ASIC. What remains in the pluggable module is conversion to analog, the TIA, and continuous-time linear equalization (CTLE). Source: Bryon Moyer/Semiconductor Engineering

Most LPO implementations today are in closed systems, where a single company can ensure that all the pieces work together to maximize the power reduction. “Broadcom has publicly shared that they have about 35% power saving,” said Shukla.

But LPO in its current form is difficult to use for interoperable pluggable optics. The characteristics of the electrical signals arriving at the module have not yet been specified to guarantee that any LPO module will work in any socket.

“The challenge of this approach is interoperability,” said Shukla. “The whole premise of the data center is you take one component from one vendor and you can replace it with others. But now, because you don’t have this retiming function, the interoperability between these two has become challenging.”

Standards for interoperability
Three standardization projects addressing LPO interoperability are underway. First, there’s now a draft multi-source agreement (MSA) for LPO. MSAs aren’t technically standards. They’re contractual agreements between vendors to produce mutually interoperable components (in a manner that presumably avoids collusion concerns). So they’re de facto standards, and they’re common in the telecom world.

Beyond that, the OIF has taken up two projects to establish standards that companies not part of the MSA can follow. The first, called the Common Electrical I/O – 112G-Linear Project, sets up electrical standards to ensure interoperability. In addition to LPO, it also serves CPO and near-package optics (NPO) applications. CPO on its own doesn’t require a linear interface, however. “LPO defines linear optics for a pluggable form factor, whereas CPO suggests co-packaging electrical and optical dies together,” said Shukla. “It does not mandate linear/non-retimed implementations.”

A team from the OIF (Jeff Hutchins, vice chair, physical and link layer working group, energy-efficient interfaces at the OIF; Yi Tang, vice chair, physical and link layer working group, electrical at the OIF; and Nathan Tracy, president of the OIF) explained the group’s motivation. “The OIF initiated its linear project prior to the LPO MSA with the specific objective to define an electrical interface that would allow interoperable non-retimed optical modules to enable system-level power savings,” they said.

The second project, called the Retimed TX Linear RX (RTLR) Project, adds a retimer on the transmit path only. The lack of a retimer in LPO constrains the distance that signals can run while still being recoverable. The old architecture retimes the received signal to clean it up after its journey, but the RTLR approach retimes the SerDes signal in the pluggable module prior to sending it out on fiber. That is intended to improve the quality of the fiber signal, allowing it to be recovered without further retiming on the receive side.

“State-of-the-art RX SerDes can equalize channels,” said the OIF team. “Many of the SerDes use linear equalization techniques, but optical modulators can introduce some nonlinear distortion. This would suggest that signal distortions remaining before the [TX] electro-optical conversion will be somewhat non-linearly transformed. Linear RX techniques therefore cannot compensate for all of that. RTLR cleans up the signal prior to conversion and therefore should improve interop and reach. In addition, the presence of the TX retimer should make the overall system more tolerant to variations between the host and module and provide an opportunity for including diagnostic capability over LPO, something that the hyperscalers like to see.”

RTLR puts another block back into the module, which will raise power somewhat. While the OIF doesn’t provide power numbers, given the dependency on technology and vendors, it did note some numbers quoted by a well-regarded figure in the industry. “Andy [Bechtolsheim] at Arista has publicly presented 18 pJ/b for [traditional] retimed, 12 pJ/b for RTLR, and 6 pJ/b for non-retimed,” said the OIF team. “Others suggest slightly better or worse numbers, e.g. 18 – 15 pJ/b for retimed.”
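To put the quoted energy-per-bit figures in more familiar terms, the short Python sketch below converts them to watts at a given data rate. The 18, 12, and 6 pJ/b values come from the quote above. The 800 Gb/s rate is purely an assumption for illustration, since no data rate is attached to those figures here, and the function names are hypothetical.

```python
# Energy-per-bit figures (pJ/b) as quoted by the OIF team from a public Arista presentation.
PJ_PER_BIT = {"retimed": 18.0, "RTLR": 12.0, "non-retimed": 6.0}

# Assumption for illustration only: the article does not state a data rate.
ASSUMED_GBPS = 800.0

def link_power_watts(pj_per_bit: float, gbps: float) -> float:
    """Power in watts. (pJ/b) x (Gb/s) yields milliwatts, so divide by 1000."""
    return pj_per_bit * gbps / 1000.0

def saving_vs_retimed(pj_per_bit: float, baseline: float = PJ_PER_BIT["retimed"]) -> float:
    """Fractional energy saving relative to the traditional retimed figure."""
    return 1.0 - pj_per_bit / baseline

for name, pj in PJ_PER_BIT.items():
    watts = link_power_watts(pj, ASSUMED_GBPS)
    print(f"{name:12s} {pj:5.1f} pJ/b  {watts:5.1f} W  saving {saving_vs_retimed(pj):.0%}")
```

At the assumed 800 Gb/s, the three figures work out to 14.4 W, 9.6 W, and 4.8 W. Independent of the data rate, RTLR comes in about one-third below the retimed figure and non-retimed about two-thirds below it.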

Fig. 4: RTLR pluggable module. This is similar to LPO, but a retimer in the pluggable module cleans up the transmit signal prior to converting to optical. The signal quality should then be good enough to receive it without retiming. Source: Bryon Moyer/Semiconductor Engineering

What’s unusual about these standards is that, typically, electrical standards are set by various IEEE groups while the OIF specifies optical standards. “The OIF is defining electrical parameters for optical compliance,” said Shukla. “This is the first time the ecosystem is saying, ‘If you do this electrically, you will need this optical spec.’”

As to spec availability, the linear spec is close to ready. “It’s in draft stage, but it’s a technically complete draft,” said Shukla. The OIF wasn’t specific on expected availability, however. RTLR will follow the linear one, and it will have multiple parts. “RTLR has been separated into two efforts, one at 112G and another at 224G,” said the OIF team. “The 112G will be available sooner than the 224G.”

Ahead of, but not replacing, CPO
The question remains as to how LPO and CPO will coexist once CPO achieves commercial traction. CPO has taken the spotlight for years with the promise of eliminating the SerDes link altogether and saving yet more energy. But it still faces development challenges, and the newer LPO is likely to be commercialized ahead of it.

Nevertheless, given the promise of yet more energy savings, work on CPO is expected to continue. “Linear CPO will definitely save more power than LPO as the channel between electrical and optical die is extremely small in CPO,” said the OIF team.

Meanwhile, LPO should be able to run initially from a half to a full meter — plenty to get from one part of the PCB to the other. But it may also be improved to run many meters, making it usable more widely within a data center campus. Whether CPO will eventually take over is unclear. Experience with LPO ahead of CPO’s availability should help determine whether there are markets for both.

Cover art credit: Photography by Christophe Finot, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=50571220
