From smartphones to AI factories, physical layers are the unsung heroes of data communications.
Over the past couple of decades, the semiconductor industry has evolved from a supporting role for traditional verticals like mobile, automotive, and PCs to a foundational role in those markets, as well as in AI factories and hyperscale data centers. Underlying this transformation is the physical layer (PHY), which has emerged as a critical enabler for data transfer and communications.
The PHY is a key component of the Open Systems Interconnection (OSI) model, which contains seven abstraction layers for connecting different systems and defining how they communicate with each other and share data. That model was developed by the International Organization for Standardization (ISO) in the early 1980s.
“The physical layer in the ISO seven-layer model can be abstracted as to what takes care of the actual physical back-and-forth of bits and bytes and/or signals,” said Marc Swinnen, director of product marketing at Ansys. “It can be radio, it can be a wire, it can be optical, it can be anything. It takes care of the physics. And the layers above that, at some point, don’t care whether it was done optically or electrically or through radio. That’s the whole point of this layered model. Each layer can ignore the details of what’s below it. There are lots of physical interface standards, including Bluetooth, Ethernet, Wi-Fi, UCIe, PCIe, etc.”
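Swinnen's point, that upper layers are written against an abstraction and stay indifferent to the medium, can be sketched in a few lines of code. This is a hypothetical illustration (the class and method names are invented, not from any real network stack):

```python
# Minimal sketch of OSI-style layering (illustrative only; not a real stack).
# The link layer is written against an abstract PHY, so the physical medium
# -- wire, optics, radio -- can change without touching the layers above it.
from abc import ABC, abstractmethod

class PhysicalLayer(ABC):
    @abstractmethod
    def transmit(self, payload: bytes) -> None: ...

class WiredPHY(PhysicalLayer):
    def transmit(self, payload: bytes) -> None:
        print(f"driving {len(payload)} bytes as voltages on a wire")

class OpticalPHY(PhysicalLayer):
    def transmit(self, payload: bytes) -> None:
        print(f"modulating {len(payload)} bytes onto a laser")

class LinkLayer:
    def __init__(self, phy: PhysicalLayer):
        self.phy = phy  # the upper layer only sees the abstraction

    def send_frame(self, payload: bytes) -> None:
        self.phy.transmit(payload)

LinkLayer(WiredPHY()).send_frame(b"hello")
LinkLayer(OpticalPHY()).send_frame(b"hello")  # same code path, different physics
```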
That physical layer has become increasingly important in data centers, where huge volumes of data need to be processed, stored, and moved back and forth. AI and HPC workloads demand an unprecedented level of system performance, and that requires massive bandwidth, ultra-low latency, and energy efficiency at scale, according to Arif Khan, senior product marketing group director for design IP in the Silicon Solutions Group at Cadence. “These requirements are not merely compute challenges. They are fundamentally interconnect challenges. This is where SerDes and PHY IP take center stage.”
But as systems transition from clean binary logic to the complexities of physical devices, they encounter the constraints of the natural world. “State transitions are not instantaneous, limiting physical bandwidth, while background noise further impacts channel capacity,” Khan noted. “Seminal research from Claude Shannon and Harry Nyquist established the foundational principles that define the maximum capacity of a channel, given its signal-to-noise ratio and encoding profile.”
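Shannon's result gives that limit a concrete form: for a channel of bandwidth B and signal-to-noise ratio SNR, capacity is C = B·log2(1 + SNR). A quick back-of-the-envelope sketch, with illustrative numbers that are not drawn from any particular PHY spec:

```python
# Shannon-Hartley capacity: C = B * log2(1 + SNR).
# The bandwidth and SNR values below are illustrative assumptions.
import math

def shannon_capacity_gbps(bandwidth_ghz: float, snr_db: float) -> float:
    snr_linear = 10 ** (snr_db / 10)  # convert dB to a linear ratio
    return bandwidth_ghz * math.log2(1 + snr_linear)  # Gb/s for B in GHz

# A hypothetical 28 GHz channel at a few SNR points: capacity falls as
# channel loss and noise erode the SNR, regardless of how fast the logic is.
for snr_db in (30, 20, 10):
    print(f"SNR {snr_db:>2} dB -> {shannon_capacity_gbps(28, snr_db):6.1f} Gb/s")
```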
One standard, but multiple options and applications
Understanding the physical layer is essential to keeping up with system demands and for remaining competitive in these domains. “In systems and SoCs, we see USB separate from PCIe and internet [communication protocols],” said Hezi Saar, executive director of product management at Synopsys and chair of the board of directors at MIPI Alliance. “Those standards were created to solve specific problems — either a connector or long reach or PCI adaptation or Ethernet networking, etc. Standard compliance gives us assurance that things are working properly and interpreted correctly. And because you have developed your previous generation, you now can get a time-to-market advantage. There are differences between those standards, and now we see more and more physical layers being invented, to answer the question of, ‘Why can’t we get all of that together, or more together?’ It’s possible, and we’re sometimes doing those combos, but they come at an expense. Standards allow vendors to not compete, because they come up with a spec together, but they’re also able to differentiate. They can make products lower power, or reduce system costs because they integrate more things or require fewer external components.”
This is evident with HDMI, for example. “Some laptops have HDMI connectivity,” said Saar. “Sometimes they have DisplayPort connectivity. Or they have both. For them to do that, they ask, ‘Do I implement my SoC with HDMI and DisplayPort?’ HDMI comes from the TV market. DisplayPort comes from the PC display market, the monitors market. How do I put in both, because I want to connect my laptop to TV at home, or I want to connect to the monitor in my business? I want to have this duality. I can create a combo HDMI/DisplayPort physical layer where the electricals are similar, but it costs more in the implementation and in the PPA in general. Or I could make a more compact implementation and get a bridge chip that is external, but that adds cost to it. Cost is external to the SoC, but it gets you the functionality. That kind of SoC can go after the low-cost market, which only needs DisplayPort, let’s say. The SoC that needs to go after the high-end market needs HDMI and DisplayPort. So this is where it comes into play of whether I need to have one or two.”
Wherever data is processed and stored, it needs a PHY. As physical layer interconnects add features to meet the requirements of applications beyond mobile, such as machine vision, PC/mobile compute, automotive, and industrial, the physical layer must be a top-of-mind consideration. For many systems targeting these applications, low to extremely low power is mandatory, and for those that are battery-operated in particular, low heat dissipation is a must.
“Whether it is mobile, AR, VR, MR, XR, IoT, smart glasses or mobile computing, power dissipation and the generated heat need to be minimized,” said Ashraf Takla, founder and CEO at Mixel. “Otherwise the commercial success of the product is in jeopardy. The system designer needs to pay close attention to how the input and output of the different components of the system communicate together, and consider the most efficient ways to communicate the data with minimum total power and heat dissipation, while in many cases also minimizing the number of wires. Without paying attention to the physical layer early on, the system designer risks ending up with a system where most of the power is spent on the communication between the different system components. That would surely result in an uncompetitive solution.”
At the same time, data bandwidth requirements continue to rise, and the physical layer must keep pace.
“In the last 20 years or so, USB and Ethernet have grown to approximately 100X or 200X the bandwidth,” Saar explained. “The SerDes technology we used before was much simpler. It was NRZ (non-return-to-zero), with the open eye that is now familiar. But there has been a paradigm shift from NRZ to PAM (pulse amplitude modulation), where it is more challenging because the multiple levels carry more content. Back in about 2000, we had simpler linear equalization. The rate was known, with an open eye on the RX side, so you could check that. But as the rates increase to around 20 to 30, almost 40 [Gbps], the base architecture shifts from SerDes with NRZ to more of a PAM4 approach, which is really more DSP-oriented.”
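The NRZ-to-PAM shift Saar describes comes down to bits per symbol: NRZ signals two levels (one bit), while PAM4 signals four levels (two bits). A minimal sketch of a PAM4 mapping, assuming the Gray-coded level assignment commonly used so that a one-level slicer error flips only one bit (an illustration, not any specific SerDes):

```python
# NRZ carries 1 bit per symbol (2 levels); PAM4 carries 2 bits per symbol
# (4 levels). The Gray-coded mapping below is a common convention, shown
# for illustration only.
PAM4_LEVELS = {"00": -3, "01": -1, "11": +1, "10": +3}

def pam4_encode(bits: str) -> list[int]:
    assert len(bits) % 2 == 0
    return [PAM4_LEVELS[bits[i:i + 2]] for i in range(0, len(bits), 2)]

print(pam4_encode("00011110"))  # 8 bits -> 4 symbols: [-3, -1, +1, +3]
```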
All of this development has accelerated the standards that are being introduced. “There’s more demand for more computation, and more computation means more cores in the same SoC, stacked in the SoC, and that is not in servers only, it’s even in the edge,” Saar said. “More computation for AI demands even more bandwidth. Getting the data in and out to do the computation becomes very important, and if we were progressing in the NRZ level, the serialization would not catch up to the data rate that we need. That’s really where the PAM4, PAM8, and beyond are coming into the market. More functionalities require faster interface speeds, and that’s why we see progression all the time, faster and faster.”
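The arithmetic behind that progression is simple: line rate equals symbol (baud) rate times bits per symbol, so each step up in modulation order lets the data rate grow without a matching jump in symbol rate. A sketch with an assumed 224 Gb/s lane target:

```python
# Line rate = baud rate * log2(levels). For a fixed line rate, each step up
# in modulation order cuts the required symbol rate -- the reason NRZ
# serialization "would not catch up" at today's lane speeds.
# The 224 Gb/s target is an assumption for illustration.
import math

line_rate_gbps = 224
for name, levels in (("NRZ", 2), ("PAM4", 4), ("PAM8", 8)):
    bits_per_symbol = math.log2(levels)
    baud = line_rate_gbps / bits_per_symbol
    print(f"{name}: {bits_per_symbol:.0f} bit/symbol -> {baud:.1f} GBaud")
# NRZ: 224 GBaud, PAM4: 112 GBaud, PAM8: ~74.7 GBaud
```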
Designing ultra-fast PHYs
However, designing PHYs operating at speeds exceeding 100G presents a myriad of challenges.
“Engineers must navigate a landscape dominated by cutting-edge technologies like PAM4 signaling, sub-picosecond jitter, and channel losses that would have been insurmountable just a decade ago,” Cadence’s Khan explained. He pointed to a number of key PHY design challenges in these areas.
Multi-die assemblies and advanced packaging add other considerations.
“When you look at chip-to-chip communications, if it’s not 3D, it’s just a PCB board,” said Ansys’ Swinnen. “It’s a regular bus network on board. But if you look at chip-to-chip, they’ve come up with their own physical standards. The most quoted and used is UCIe, which has been given out to the public domain. There are other standards, as well, such as Bunch of Wires. Each has pros and cons, but the point is to get the highest bandwidth at the lowest power possible. That is the key, because when you’re looking at a 3D system, you’ve disaggregated it. Where it used to be an SoC, now you have multiple dies, and you’d typically have to pay a penalty for that disaggregation. There’s a penalty in speed and power as you get off chip through these PCB lines, back on chip as buffers, drivers, big wires. There’s a big penalty in power and speed, which was always the driving factor as to why people integrated, because you got a huge boost in avoiding that physical interconnect. Why it’s become popular now is because the density of our pitches and the wires that connect them offer enough bandwidth, enough shoreline. You know how many bumps you can put. There’s enough shoreline because of the pitch. The wires are thin enough because they’re using interposer with like 65nm technology, or 35nm technology to fabricate the interposer. So you can actually get high-speed, high-bandwidth, low-power connectivity in the chiplets, and that is what is enabling this disaggregation — along with not having to pay too big of a price for that.”
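The “shoreline” argument is a budget calculation: die-edge length divided by bump pitch bounds the number of bumps, and bumps times per-lane rate bounds die-to-die bandwidth. A rough sketch with assumed, illustrative numbers (not values from the UCIe specification):

```python
# Die-edge ("shoreline") bandwidth budget for a chiplet interface.
# All numbers are illustrative assumptions, not values from the UCIe spec.
shoreline_mm    = 5.0    # die edge length available for the interface
bump_pitch_um   = 45.0   # microbump pitch on the interposer
rows_of_bumps   = 4      # bump rows deep along that edge
signal_fraction = 0.75   # the rest go to power, ground, and clocks
lane_rate_gbps  = 16.0   # per-lane data rate

bumps = (shoreline_mm * 1000 / bump_pitch_um) * rows_of_bumps
lanes = bumps * signal_fraction
print(f"{lanes:.0f} signal bumps -> ~{lanes * lane_rate_gbps / 1000:.1f} Tb/s "
      "of raw edge bandwidth")
# Tighter pitch (finer interposer metal) scales this directly, which is why
# dense 2.5D pitches made chiplet disaggregation affordable.
```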
The physical layer has a direct interface to the physical effects, said Andy Heinig, head of the Chiplet Center of Excellence at Fraunhofer IIS’s Engineering of Adaptive Systems Division. “This means you are often going to analog voltage domains or analog signals. And for this we have two domains, and they have to work together. On one side are the analog engineers. On the other side are the digital engineers. You have to bridge the gap between two totally different domains. This is often hard to do. We see in our teams that there are problems getting analog and digital engineers to really work together. Analog people are focused on solving the analog problems, but they often forget how it works in the system. And finding the right levels of abstraction makes this complicated. On the other side, if you can improve something on the physical layer, you get a lot of performance out of that. But again, it’s an interaction between the levels on top of the physical layer and the physical layer itself, because you can also co-optimize here. You may also be able to tolerate some errors on the physical layer if you have enough correction in the protocol layer, and vice versa. You can move it forward and back, and sometimes it’s not co-optimized because of standards. Then you will lose overall performance, because everything is optimized individually, not as a whole system.”
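Heinig’s trade between PHY errors and protocol-layer correction can be made numeric. With a block code that corrects up to t symbol errors per codeword, the post-correction failure rate is a binomial tail, so a noisier, cheaper PHY can still meet a system error budget. The toy model below uses assumed parameters in the spirit of the RS-type FEC used in high-speed links, not any specific standard:

```python
# Toy model of raw PHY error rate vs. protocol-layer correction (FEC).
# A codeword of n symbols fails only if more than t symbols are hit.
# The (n, t) values and symbol error rates are illustrative assumptions.
from math import comb

def codeword_failure_prob(ser: float, n: int, t: int) -> float:
    """Probability that more than t of n symbols are in error (binomial tail)."""
    return sum(comb(n, k) * ser**k * (1 - ser)**(n - k)
               for k in range(t + 1, n + 1))

n, t = 544, 15            # a hypothetical block code correcting 15 symbols
for ser in (1e-4, 1e-5):  # raw symbol error rates: noisier vs. cleaner PHY
    p = codeword_failure_prob(ser, n, t)
    print(f"raw SER {ser:.0e} -> codeword failure ~{p:.2e}")
```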
What to keep in mind
The first consideration for the PHY is ensuring which standard best fits the application, said Mixel’s Takla. “Is the data communication symmetrical or asymmetrical? What are the tradeoffs between the number of lanes versus the data rate of each lane? How important is it to minimize the number of wires? Is multi-drop communication needed for the application? How do those choices affect power and heat dissipation? What are the implications for latency and power-up time? Are the system’s physical interface layer choices compatible with the physical interface layers that the system needs to externally communicate with?”
At the chip level, IP providers are largely agnostic about the physical layer protocols of interfaces to and from the chips in which their cores are embedded. Still, Steve Roddy, chief marketing officer at Quadric, noted that SoC and systems designers need to accurately model the data traffic generated by complete applications running on the processor cores. System designers use that profile data to make informed decisions about the logical and physical layers of interfaces in new systems, although the IP provider generally is not involved in those activities by its customers.
Once the system model is created, major physical effects can be introduced into it so that the correlation between those effects and their system-level impact is understood. “Then you can better understand which problems can cause issues or have a big influence, what can be ignored, what is only a second-order effect, and where you have to spend your main efforts,” said Fraunhofer’s Heinig. “This is something you can figure out with a system model, so you have more insight than you would by working only on the analog side, for instance. This is something we explain to all of our analog engineers, who focus on optimizing the analog part. They always have to look more at the system level, so they better understand the influence of their decisions on the system, and so they know that if they change something on the system level, ‘this’ happens on the analog side. That includes all the physical effects: electronic, thermal, and mechanical.”
Conclusion
As the industry advances toward 448G and beyond, challenges will only intensify, especially with the advent of chiplet disaggregation, optical I/O, and AI-native architectures. “The PHY layer is no longer just a pipe. It has become a strategic enabler,” said Cadence’s Khan. “Meeting these demands will require continued innovation and a steadfast commitment to pushing technological boundaries.”