Anatomy Of A (Better) Gaming Platform

New Xbox 360 uses 45nm process, providing room to consolidate multiple components into a single SoC.


By Pallab Chatterjee
Microsoft’s third-generation Xbox 360 engine uses a 45nm silicon-on-insulator (SOI) process and a new architecture.

The original design was built on a 90nm process and then migrated to 65nm. In both cases the fundamental architecture of the system remained the same: a CPU (central processing unit) chip, a GPU (graphics processing unit) chip, and a memory management chip for the front-side bus (FSB) to the DRAM.

The size and power reduction possible with a 45nm process changed that architecture. With the ability to integrate 347 million transistors on a single die, the new design combines the CPU, GPU and FSB memory controller on a single chip. The design features three CPU cores with L1 and L2 cache, a GPU core with a GDDR3 memory interface, a video-out controller, a PCIe interface and an FSB manager. The chip is placed as part of a CPU module that includes an HSIO (high-speed input/output) interface to an EDRAM die.

Fig. 1: Changes at 45nm

Filling out the system is a South Bridge block that contains interfaces to the system Flash, HDD, optical disk drive, USB, IR remote, and wireless 802.11n circuits. This architecture was chosen based partly on the needs of the system and partly on the capability of the process. At the 45nm node on an SOI process, the design can be implemented in a way that minimizes leakage and standby current. That allows Microsoft to bring these simultaneously active, high-power blocks onto a single monolithic die. The architecture includes system-level temperature sensors and control, as well as block-based power down. Using the capabilities of SOI, the power down is implemented with multiple power domains.

The chip now has 6 PLLs that support a total of 12 clock domains. The design uses an adaptive power supply (APS) system, which results in 8 power domains. These separate the memory, CPUs and GPU into different power domains, as well as the I/Os and interface logic. To facilitate interconnection to these blocks, the design uses C4 pads on a 35mm PC-PBGA package with a 3-2-3 buildup of layers. This reconfiguration resulted in a total net power reduction of more than 60% from the original design and a reduction of more than 50% in silicon area.

As the major blocks of the SoC came from different sources, the final chip was built using three different design methodologies. The CPUs were built using a semi-custom design methodology that supported synthesizable macros, full-custom macros, an 18-track, high-performance base library, a full, independent clock grid and transistor-level timing analysis. The GPU area was built using an ASIC-style standard-cell methodology, with a 12-track high-density (as opposed to high-performance) base library, only synthesized macros, a combination of traditional H-tree and full clock grid, and gate-level timing analysis to finalize the design.

The overall chip and the block infrastructure were built using a full-chip hierarchical methodology that instantiated the CPUs and GPU as “hard macros” at the top level of the design. Timing analysis was performed hierarchically and partitioned along the blocks and paths. The hierarchical nature required a mix of device-level and gate-level verification of the signals, based on their criticality. At this top level, the chip's design-for-test (DFT) logic was inserted and pushed down hierarchically through the blocks, including the hard macros for the CPUs and GPU.

The top-level logic verification faced a challenge from the new architecture: the chip had to be backward-compatible to support the existing library of games, with no change in game play. As a result, sequential equivalence checking was used to validate the design. This technique compares the corresponding sequential path outputs from the two different design representations and makes sure they are the same in both performance and function.
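
To make the idea of sequential equivalence concrete, here is a minimal sketch in Python. It is not the flow or tooling used on the Xbox 360 chip, which the article does not describe. It is a toy illustration under stated assumptions: each design representation is modeled as a small state machine with a hypothetical step(state, input) function that returns (next_state, output), and the checker walks every reachable pair of states to confirm that the outputs match cycle for cycle. All names here (sequentially_equivalent, binary_counter, one_hot_counter, buggy_counter) are invented for the example, and the check covers function only, not performance.

```python
from collections import deque


def sequentially_equivalent(step_a, reset_a, step_b, reset_b, inputs):
    """Explore the product of two state machines from their reset states.

    Returns (True, None) if every reachable state pair produces identical
    outputs for every input, else (False, trace), where trace is the
    shortest input sequence that exposes a mismatch.
    """
    start = (reset_a, reset_b)
    seen = {start}
    queue = deque([(start, ())])
    while queue:
        (state_a, state_b), trace = queue.popleft()
        for x in inputs:
            next_a, out_a = step_a(state_a, x)
            next_b, out_b = step_b(state_b, x)
            if out_a != out_b:
                return False, trace + (x,)
            pair = (next_a, next_b)
            if pair not in seen:
                seen.add(pair)
                queue.append((pair, trace + (x,)))
    return True, None


def binary_counter(state, enable):
    """Reference design (hypothetical): 2-bit counter, carry out on wrap."""
    if not enable:
        return state, 0
    nxt = (state + 1) % 4
    return nxt, int(nxt == 0)


def one_hot_counter(state, enable):
    """Re-implementation (hypothetical): same counter, one-hot state bits."""
    if not enable:
        return state, 0
    nxt = ((state << 1) | (state >> 3)) & 0b1111  # rotate left by one bit
    return nxt, int(nxt == 0b0001)


def buggy_counter(state, enable):
    """Faulty re-implementation (hypothetical): wraps one cycle too early."""
    if not enable:
        return state, 0
    nxt = (state + 1) % 3
    return nxt, int(nxt == 0)


good = sequentially_equivalent(binary_counter, 0, one_hot_counter, 0b0001, (0, 1))
bad = sequentially_equivalent(binary_counter, 0, buggy_counter, 0, (0, 1))
print(good)
print(bad)
```

The two counters use different internal state encodings, much as a re-architected chip differs internally from its predecessor, yet the checker accepts them because their observable outputs agree on every cycle. The faulty variant is rejected with a short input sequence that reproduces the mismatch. Exhaustive exploration like this only scales to tiny machines, so it should be read as an illustration of the principle rather than a practical method for a 347-million-transistor design.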

The advanced process provides for lots of capabilities in a new SoC, but it comes with dramatic changes in device architecture and requires multiple design flows and tools to complete the task.