What Is UCIe?

Device interoperability enables the multi-die system market.

The semiconductor industry is undertaking a major strategy shift towards multi-die systems. The shift is fueled by several converging trends:

  1. Monolithic SoCs are becoming too large to manufacture
  2. Some SoC functionalities may require different process nodes for optimal implementation
  3. Desire for enhanced product scalability and composability is increasing

Multi-die systems are driving the need for standardized die-to-die interconnects. Several industry alliances have come together to define such standards, as shown in figure 1.

  • Optical Internetworking Forum (OIF) – The XSR and USR physical layer specifications optimized for die-to-die connectivity
  • CHIPS Alliance – The AIB specification, which was originally introduced by Intel
  • Open Compute Project (OCP) – The OpenHBI and Bunch of Wires (BoW) specifications optimized for different use cases
  • Universal Chiplet Interconnect Express (UCIe) – A comprehensive die-to-die interconnect specification covering multiple use cases and a complete protocol stack

Fig. 1: Several organizations have defined and developed standards for die-to-die interconnects.

This article takes a closer look at the UCIe specification and its main advantages.

Universal Chiplet Interconnect Express (UCIe)

UCIe is a comprehensive specification that can be used immediately as the basis for new designs, while creating a solid foundation for future specification evolution.

Unlike other specifications, UCIe defines a complete stack for die-to-die interconnect. This ensures interoperability of compliant devices, which is a mandatory requirement for enabling the multi-die system market.

UCIe roadmap and use cases

From the start, UCIe incorporates features that support multiple current and emerging use cases. UCIe supports the data rates required today, from 8Gbps/pin to 16Gbps/pin. UCIe is also expected to support flexible data rates up to 32Gbps/pin, which will be a requirement in future high-bandwidth networking and data center applications.

UCIe supports all types of package technologies in two ways:

  • UCIe for advanced packages (silicon interposer, silicon bridge or RDL fanout)
  • UCIe for standard packages (organic substrate or laminate)

Both options share the same architecture and protocols. The only difference is in bump map and PHY organizations. This difference means that system architecture, system validation, and software development can be re-used regardless of the chosen package type for a particular SoC.

UCIe supports novel resource aggregation (or pooling) architectures in data centers, either within the blade with flexible PCIe/CXL IO dies or rack-to-rack with UCIe-enabled optical IO dies.

Most importantly, UCIe supports compute scaling by leveraging streaming (user-defined) protocols to create low-latency connections between the Networks-on-Chip (NoCs) of multiple server (or AI) dies in the same package.

UCIe specification overview

As shown in figure 2, the UCIe specification is divided into three stack layers: Physical Layer, Die-to-Die Adapter Layer and Protocol Layer.

  • Physical Layer is the electrical interface to the package media. It includes the electrical AFE (transmitter, receiver) as well as a sideband channel to enable parameter exchange and negotiation between two dies. It also includes the logic PHY which implements the link initialization, training and calibration algorithms, as well as test and repair functionality.
  • Die-to-Die Adapter Layer takes care of link management functionality as well as protocol arbitration and negotiation. It includes the optional error correction functionality which is based on a CRC and retry mechanism.
  • Protocol Layer implements one or more of the UCIe-supported protocols. Today, these are PCI Express, CXL, and streaming, all of which are FLIT-based protocols offering maximum efficiency and reduced latency.

Fig. 2: The UCIe specification layering.

Physical Layer

The UCIe interface uses clock forwarding and single-ended, low-voltage, DDR signaling to improve power efficiency. Power supply disturbances can be reduced by scrambling the data at the PHY level. Unlike other techniques such as data bus inversion (DBI), data scrambling does not impact bandwidth efficiency.
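The bandwidth claim can be illustrated with a small sketch. Scrambling XORs each data bit with a pseudo-random bit, so the scrambled stream is exactly as long as the input, whereas an inversion-flag scheme has to carry extra signaling. The 16-bit LFSR, its taps, and the seed below are textbook values chosen for illustration only; they are assumptions and not the polynomial defined by UCIe.

```python
def lfsr_bits(seed: int, nbits: int):
    """Yield nbits pseudo-random bits from a 16-bit Fibonacci LFSR.

    The taps are a common textbook maximal-length choice,
    NOT the scrambler polynomial from the UCIe specification.
    """
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero")
    for _ in range(nbits):
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield bit


def scramble(bits, seed=0xACE1):
    """XOR the data bits with the LFSR sequence.

    Applying the same function with the same seed descrambles.
    """
    return [b ^ k for b, k in zip(bits, lfsr_bits(seed, len(bits)))]


# A long run of zeros becomes a toggling pattern on the wire,
# and no extra bits are added.
data = [0] * 32
on_wire = scramble(data)
recovered = scramble(on_wire)
```

Because transmitter and receiver run the same sequence, the receiver recovers the data with a second XOR, and the number of bits on the wire is unchanged.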

The receiver data recovery is greatly simplified due to clock forwarding in parallel with the data, leading to additional power and latency savings. Figure 3 shows a block diagram of the UCIe PHY architecture.

Fig. 3: Block diagram of the UCIe PHY architecture.

UCIe defines a module as the smallest interface unit. Each module includes a mainband “bus” of up to 64 transmit and 64 receive IOs for advanced packages (or 16 for standard packages), forwarded clock IOs, a valid (framing) IO, and a track IO. A sideband “bus” is also implemented, as shown in figure 4.

Fig. 4: The UCIe module implements a mainband and a sideband bus.
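As a rough illustration of what these module widths mean, the raw mainband bandwidth of one module in one direction is simply the number of data IOs multiplied by the per-pin data rate. The sketch below uses the lane counts and data rates quoted in this article; the function and dictionary names are hypothetical, and the result ignores any protocol or framing overhead.

```python
# Data IOs per direction for a full-width module, as quoted in the text.
LANES_PER_MODULE = {"advanced": 64, "standard": 16}


def module_raw_bandwidth_gbps(package: str, rate_gbps_per_pin: float) -> float:
    """Raw mainband bandwidth of one module, one direction, in Gbps.

    Assumes every data IO of the module is active; overhead is ignored.
    """
    return LANES_PER_MODULE[package] * rate_gbps_per_pin


for package in ("advanced", "standard"):
    for rate in (8, 16, 32):
        bw = module_raw_bandwidth_gbps(package, rate)
        print(f"{package:8s} package at {rate:2d} Gbps/pin: {bw:6.0f} Gbps per direction")
```

For example, a 64-lane advanced package module at 16Gbps/pin carries 64 × 16 = 1024 Gbps of raw data in each direction.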

To reduce yield loss due to microbump (µbump) quality in advanced package assembly, UCIe offers a test and repair mechanism that relies on 6 redundant pins (for TX and RX data, clock, valid and track) and 2 redundant pins (for sideband TX and RX).

UCIe doesn’t implement pin redundancy for standard package since the C4 (or CuPillar) bump yield and complete assembly process yield are very high. For these packages, UCIe supports a “degraded” operating mode where only half of the module is active in case a failure is detected on the other half.

The test and repair process is implemented at link initialization. The PHY tests each die connection to determine if there is any failure. In case of failure, the corresponding signal is re-routed to a redundant pin as shown in figure 5.

Fig. 5: The Physical Layer tests each die connection to determine failure and re-routes signals to a redundant pin.
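The re-routing idea can be sketched as a simple lane-to-pin map. This is a deliberately simplified model, assuming each failed lane is moved to the next free redundant pin; the actual repair scheme, pin grouping, and limits are defined by the UCIe specification and are not reproduced here. The function name and pin numbering are hypothetical.

```python
def repair_map(num_lanes, spare_pins, failed_lanes):
    """Return a dict mapping each logical lane to a physical pin.

    Healthy lanes keep their default pin (same index as the lane).
    Each failed lane is re-routed to the next unused redundant pin.
    Simplified model, not the UCIe-defined repair algorithm.
    """
    mapping = {lane: lane for lane in range(num_lanes)}
    spares = list(spare_pins)
    for lane in sorted(failed_lanes):
        if not spares:
            raise RuntimeError("not enough redundant pins to repair the link")
        mapping[lane] = spares.pop(0)
    return mapping


# Eight data lanes on pins 0..7, with pins 8 and 9 as redundant pins.
# Link initialization found lane 3 faulty, so it moves to pin 8.
mapping = repair_map(num_lanes=8, spare_pins=[8, 9], failed_lanes={3})
```

When more lanes fail than there are redundant pins, this sketch raises an error; a real design would instead fall back to whatever reduced-width behavior the specification allows.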

Table 1 shows the main differences between the UCIe specification for advanced packaging and standard packaging.

Table 1: Different UCIe PHY features for advanced packaging versus standard packaging. 

The differences are only noticeable at the electrical level and do not impact the upper protocol layers, as previously discussed. The differences derive from the significantly coarser minimum bump pitch required for standard packages (110µm) versus advanced packages (45µm) and from the need to support longer channel reaches in standard packages for added flexibility.

Die-to-Die Adapter Layer

The Die-to-Die Adapter Layer is an intermediate layer that interfaces any protocol to the UCIe PHY Layer and manages the link itself. At link initialization, it waits for the PHY to complete its own initialization, including calibration, test, and repair, and then initiates the discovery of the capabilities of both dies. If several protocols are implemented, the two adapters agree on which one to use before handing over to the Protocol Layer for mission-mode activity.
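The sequence described above (wait for the PHY, discover capabilities, agree on a protocol) can be sketched as follows. The preference order, the protocol labels, and the function name are assumptions made for illustration; the real negotiation is carried out through the exchange defined in the UCIe specification.

```python
# Hypothetical preference order, used only to break ties in this sketch.
PREFERENCE = ["CXL", "PCIe", "Streaming"]


def negotiate_protocol(local_caps, remote_caps, phy_ready):
    """Pick the protocol both dies support, or None if no link can be formed.

    phy_ready models the PHY having finished initialization,
    calibration, test, and repair.
    """
    if not phy_ready:
        return None  # the adapter must wait for the PHY first
    common = set(local_caps) & set(remote_caps)
    for protocol in PREFERENCE:
        if protocol in common:
            return protocol
    return None


# An IO die offering PCIe and streaming meets a compute die offering CXL and PCIe.
selected = negotiate_protocol({"PCIe", "Streaming"}, {"CXL", "PCIe"}, phy_ready=True)
```

In this example the only protocol both dies advertise is PCIe, so that is what is handed to the Protocol Layer.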

The interface between the Die-to-Die Adapter Layer and the Protocol Layer, called the FLIT-aware Die-to-Die Interface (FDI), is a FLIT-based interface. To adapt to different protocols, it supports various FLIT modes:

  • CXL3 256B standard FLIT mode
  • CXL3 256B latency optimized FLIT mode
  • PCIe6 256B FLIT mode
  • CXL2 68B enhanced FLIT mode
  • Streaming 64B raw mode

UCIe also defines raw modes for CXL and PCI Express protocols. These modes are intended for retimer applications when UCIe traffic runs across an optical link. When in retimer mode, latency and error rates are not defined by the UCIe link itself, and it is assumed that the Protocol Layer will take care of all the error correction mechanisms, including CRC, retry, and possibly FEC. In raw mode, the Die-to-Die Adapter Layer does not add CRC codes to the protocol FLIT, and it neither checks for errors nor applies the retry mechanism on the receiver.
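The difference between the two behaviors can be sketched as below. This is only an illustration: it uses the CRC-32 from Python's zlib module as a stand-in and appends it after the payload, whereas the actual CRC and FLIT layouts are defined by the UCIe specification. The function names are hypothetical.

```python
import zlib

CRC_BYTES = 4  # size of the stand-in CRC-32, not a UCIe parameter


def adapter_tx(flit: bytes, raw_mode: bool) -> bytes:
    """Adapter transmit path: add a CRC unless the link is in raw mode."""
    if raw_mode:
        return flit  # passed through untouched
    return flit + zlib.crc32(flit).to_bytes(CRC_BYTES, "big")


def adapter_rx(frame: bytes, raw_mode: bool):
    """Adapter receive path: return (payload, ok).

    ok is False when the CRC check fails, which would trigger a retry.
    In raw mode nothing is checked; the Protocol Layer is responsible.
    """
    if raw_mode:
        return frame, True
    payload, received = frame[:-CRC_BYTES], frame[-CRC_BYTES:]
    ok = zlib.crc32(payload).to_bytes(CRC_BYTES, "big") == received
    return payload, ok


flit = bytes(range(64))
frame = adapter_tx(flit, raw_mode=False)
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit in transit
```

A clean frame passes the check, the corrupted one is flagged for retry, and in raw mode the same corrupted bytes would be delivered to the Protocol Layer unchecked.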

Protocol Layer

UCIe maps common protocols, like PCI Express and CXL, enabling developers to leverage previous work on software stacks and simplify the adoption of in-package integration using multi-die architectures. UCIe expects standardization of other protocol mappings in its future releases.

UCIe also enables the mapping of other protocols via the streaming mode. For example, low-latency connections between NoC fabrics on two compute dies can be supported with CXS or AXI bridges to the FDI interface in streaming mode. Other user-defined protocols can be implemented in the same way, taking advantage of the Physical Layer and Die-to-Die Adapter Layer link management features.

When implementing a UCIe interconnect, the architect may choose to support one or more of these protocols. Implementing multiple protocols enhances the applicability of the die in different use cases, a real advantage in the context of an open multi-die system marketplace. The Die-to-Die Adapter Layer is responsible for the discovery and selection of which protocol to use in a given interconnect.

Conclusion

The UCIe specification brings together several competitive advantages for multi-die system designers: high energy efficiency (pJ/b), high edge usage efficiency (Tbps/mm), and low latency (ns); support for the most popular IO protocols as well as any user-defined protocol; compatibility with all types of package technologies, from organic substrates to advanced silicon interposers; and coverage of all the critical aspects of the interface (initialization, sideband, protocol, test and repair, error correction, etc.).

These advantages make UCIe a compelling technology, poised to ease the path toward a truly open multi-die system ecosystem by ensuring interoperability.

The UCIe promoters outlined a compelling roadmap to support the industry’s new use cases and requirements. The promoters expect UCIe to support higher data rates and new protocols, 3D packaging, and other aspects of multi-die system design such as form factors, security, testability, etc.



