“Eating Your Own Dog Food” When Developing An Emulator

Emulation in supercomputing, ARM TechCon, and your own domain.


It’s a great week for emulation, with ARM TechCon happening in Silicon Valley. Palladium Z1 is a finalist for Best Product in the categories “Best Chip” and “Best System,” and we started the week with the announcement that Fujitsu adopted the Cadence Palladium Z1 Enterprise Emulation Platform for their ARMv8-based “Post-K Supercomputer Development.” Cadence has faced some of the same challenges that supercomputing faces, and it turns out that we have “eaten our own dog food”: we used emulation to develop our latest emulator, Palladium Z1.

The challenges involved in developing Japan’s next-generation flagship supercomputer, the Post-K computer, are staggering. Its performance target is up to 100 times that of the K computer, the original 10-petaflop supercomputer that started full-scale operation in 2012. Fujitsu decided to use the Palladium Z1 enterprise emulation platform exclusively for its advanced server and supercomputer development.

Just last week at DVCon in Europe, it also became clear how ubiquitous emulation has become as a key pillar of verification, from IP to subsystems, SoCs, and systems. Hobson Bullman, General Manager, Technology Services Group at ARM, described their need for earlier software development and for flexibility in verification across FPGA, emulation, and software simulation. In his presentation, he counted the required verification cycles in gigacycles for simulation, petacycles for emulation, and exacycles for FPGA-based prototyping…

In a presentation called “Overcoming the Challenges of Verifying Hardware and Validating Software for the Development of High Performance Super Computers” at CDNLive, Fujitsu described the development of the current K computer and how Palladium XP helped them at the time to perform software-driven verification and software validation. They disclosed how about 100,000 high-performance CPUs are densely connected through their own interconnect chip. System boards with four nodes are combined into racks with approximately 100 nodes, and the racks are then connected into whole systems with about 100,000 nodes. A complex software stack executes on top of the hardware. With software-driven verification in emulation, test programs could be executed early, which allowed tuning of the hardware and improvements in network throughput.

Having been closely involved in the development of Palladium Z1, I find this very familiar. The Palladium hardware combines custom Cadence bit-level processor IP and Cadence peripheral IP into a system on chip (SoC) with significant memory content. Several SoCs are then integrated on boards into logic drawers, each yielding an emulation capacity of 32 million gates. Six drawers are integrated into a cluster, yielding 192 million gates of emulation capacity. Three clusters can be integrated into a rack of standard datacenter format, yielding 576 million gates of emulation capacity per rack. Up to 16 racks can be combined using optical interconnect to yield 9.2 billion gates, accessible to up to 2,304 parallel users.
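The capacity figures above follow from simple multiplication, which the short Python sketch below reproduces. The constant names are illustrative, and the per-drawer and per-user values at the end are derived here by assuming users are spread evenly across the hardware; they are arithmetic on the quoted numbers, not figures stated in this post.

```python
# Figures quoted in the paragraph above; names are illustrative only.
GATES_PER_DRAWER = 32_000_000
DRAWERS_PER_CLUSTER = 6
CLUSTERS_PER_RACK = 3
MAX_RACKS = 16
MAX_PARALLEL_USERS = 2304

# Roll the capacity up the hierarchy: drawer -> cluster -> rack -> system.
gates_per_cluster = GATES_PER_DRAWER * DRAWERS_PER_CLUSTER   # 192 million
gates_per_rack = gates_per_cluster * CLUSTERS_PER_RACK       # 576 million
gates_max_system = gates_per_rack * MAX_RACKS                # 9,216 million, i.e. about 9.2 billion

# Derived values (assumption: users are distributed evenly over all drawers).
total_drawers = DRAWERS_PER_CLUSTER * CLUSTERS_PER_RACK * MAX_RACKS  # 288
users_per_drawer = MAX_PARALLEL_USERS // total_drawers               # 8
gates_per_user = gates_max_system // MAX_PARALLEL_USERS              # 4 million

print(f"cluster: {gates_per_cluster:,} gates")
print(f"rack:    {gates_per_rack:,} gates")
print(f"system:  {gates_max_system:,} gates")
print(f"per user at full occupancy: {gates_per_user:,} gates")
```

Under that even-split assumption, the quoted user count works out to eight users per drawer, each with roughly 4 million gates when the system is fully shared.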

A software stack of more than 3 million lines of code executes on and is connected to the Palladium Z1 hardware. Firmware enables bare-metal execution, connections, and interfaces, and a run-time operating system controls the Palladium Z1 during execution, including complex job scheduling and resource management. An application software stack and GUI enable key functionality like hardware/software debug, in-circuit emulation, verification acceleration, low-power verification, dynamic low-power analysis, virtualization of real-world interfaces, etc.

[Figure: Palladium Z1 hierarchy]

Back in their presentation at CDNLive, Fujitsu disclosed that about 35% of bugs were detected using software-driven verification in emulation, 61% in simulation and formal, and 4% in the actual prototype.

Similarly, we at Cadence used emulation to develop our next-generation emulator. Palladium Z1 was emulated on the previous-generation emulator, Palladium XP II, including PCI Express (PCIe) target interfaces, using both verification acceleration and in-circuit emulation. Thanks to hardware abstraction layers, the software couldn’t tell which environment it was running in, emulation or real silicon. The same software that customers would later use on the full hardware was running more than a year before first shipment.
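To illustrate the general idea of a hardware abstraction layer, here is a minimal Python sketch. It is not the Palladium software: every class, function, and register name below is hypothetical, and both backends are simple in-memory stand-ins for what would really be an emulator link or a PCIe connection to silicon. The point is only that the upper-level routine talks to one interface and therefore cannot tell which backend sits underneath.

```python
from abc import ABC, abstractmethod


class Target(ABC):
    """Hypothetical abstraction layer: the only interface upper software sees."""

    @abstractmethod
    def write_reg(self, addr: int, value: int) -> None: ...

    @abstractmethod
    def read_reg(self, addr: int) -> int: ...


class _DictBackedTarget(Target):
    """In-memory stand-in; a real backend would talk to actual hardware."""

    def __init__(self) -> None:
        self._regs = {}

    def write_reg(self, addr: int, value: int) -> None:
        self._regs[addr] = value & 0xFFFFFFFF  # model 32-bit registers

    def read_reg(self, addr: int) -> int:
        return self._regs.get(addr, 0)


class EmulatedTarget(_DictBackedTarget):
    """Placeholder for the design running on an emulator."""


class SiliconTarget(_DictBackedTarget):
    """Placeholder for the design running on real silicon."""


CTRL_REG = 0x00  # hypothetical register address


def bring_up(target: Target) -> int:
    """Environment-agnostic routine: it uses only the Target interface."""
    target.write_reg(CTRL_REG, 0x1)
    return target.read_reg(CTRL_REG)


# The same routine runs unchanged against either backend.
results = [bring_up(t) for t in (EmulatedTarget(), SiliconTarget())]
print(results)
```

Because `bring_up` depends only on `Target`, swapping the emulated backend for the silicon one requires no change to the calling code, which is the property the paragraph above describes.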

Eating our own dog food really helped. Emulating Palladium Z1 on Palladium XP II contributed significantly to first-time-right silicon. Palladium has great momentum, as confirmed in our earnings calls and by analysts. I am looking forward to seeing how Palladium emulation at Fujitsu will help develop the next-generation supercomputing platform!

See you at ARM TechCon!