Containing Design Complexity With POP IP

How to deal with complex power issues at advanced nodes.


About 25 years ago, Carver Mead, one of the pioneers of VLSI design, told a technical audience then grappling with the complexities of quarter-micron design that he could see an evolutionary path to about 130nm, but after that point, the picture blurred.

Fast forward to the present, and we’re manufacturing SoCs at 7nm. The output is truly amazing devices powering applications that neither we nor Mead could have imagined back then.

But the device sophistication demanded by the market means increased complexity in implementing those mind-blowing designs. At 7nm and 5nm, the traditional implementation challenges around things such as thermal issues, power dissipation and timing are elevated and amplified in often painful ways.

In some cases, the source of the challenge is physical. For example, at 7nm, the transistors are about 30 atoms across, and the dielectric layers are just 1 to 2 molecules thick. With those constraints, there’s not much room for error in either design or manufacturing.

Likewise, if you thought IR drop (the voltage lost across the power network, which grows as shrinking dimensions drive up resistance, as Brian Bailey writes elsewhere on this site) was a problem before, it’s even trickier now.

Devil’s in the details
Traditional power grids need to be altered so they work with the architecture of the standard cell library. For logic, the challenge at 7/5nm is to properly manage the interaction between the standard cells and the power grid. The days when you could build a power grid without considering the standard cells are over. The architecture of the standard cells must fit with the power grid implementation; therefore, the power grid must be selected based on the logic architecture. On top of that, the grid must be tight enough to meet your IR drop target. IR drop and electromigration issues will be almost impossible to resolve if this interaction and your specification have not been properly accounted for from the beginning.
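To make the scaling pressure concrete, here is a minimal sketch of the first-order relationship behind IR drop: a wire's resistance is R = ρL/(W·T), and the voltage lost across it is V = I·R. The resistivity, strap dimensions and current below are illustrative assumptions, not values from any real process or from Arm. Real sign-off relies on extracted grid models inside the EDA tools, not on a single-wire formula.

```python
# First-order IR drop on a single power strap.
# All numeric values are hypothetical, chosen only to show the trend.

def wire_resistance(resistivity_ohm_m, length_m, width_m, thickness_m):
    """Resistance of a rectangular wire: R = rho * L / (W * T)."""
    return resistivity_ohm_m * length_m / (width_m * thickness_m)

def ir_drop(current_a, resistance_ohm):
    """Voltage lost across the wire: V = I * R."""
    return current_a * resistance_ohm

RHO = 2.0e-8      # assumed effective resistivity, ohm*m
LENGTH = 50e-6    # assumed strap length, 50 um
CURRENT = 1.0e-4  # assumed current through the strap, 0.1 mA

# Same strap length, but the second wire has half the width and half the thickness.
wide = wire_resistance(RHO, LENGTH, width_m=100e-9, thickness_m=100e-9)
narrow = wire_resistance(RHO, LENGTH, width_m=50e-9, thickness_m=50e-9)

for name, r in (("wide", wide), ("narrow", narrow)):
    print(f"{name:>6} strap: R = {r:.0f} ohm, drop = {ir_drop(CURRENT, r) * 1e3:.1f} mV")
print(f"resistance ratio: {narrow / wide:.1f}x")
```

Halving both the width and the thickness of a strap quadruples its resistance, so the same current costs four times the voltage. That is why the grid pitch and the standard cell architecture have to be planned together rather than patched late.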

If you thought placement was restrictive before, now it’s more complex and time-consuming. There are things you don’t even see in your place and route tools because they’re at layers that aren’t modeled in the abstract views used within the tools. These layers have an impact in manufacturing, but they haven’t been modeled or accounted for in standard SoC implementation flows, so you don’t see their effects until you’re close to tape out, running final DRC. This can result in pre-tapeout iterations, and project timelines should plan for them.

To illustrate the increase in complexity, consider that at 28nm, a LEF technology file (Library Exchange Format, the model for routing layers) has roughly 70 lines. At the 16/14nm node, a LEF has twice as many, and at 7nm, this increases to about 230 lines.
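As a quick check on that growth, the sketch below turns the quoted line counts into multiples of the 28nm baseline. The 16/14nm figure of 140 is simply "twice as many" applied to the rough 28nm count, so treat all three numbers as approximations.

```python
# Approximate LEF technology-file line counts quoted in the text.
# 16/14nm is derived as 2 x the 28nm figure ("twice as many").
lef_lines = {"28nm": 70, "16/14nm": 2 * 70, "7nm": 230}

baseline = lef_lines["28nm"]
growth = {node: lines / baseline for node, lines in lef_lines.items()}

for node, lines in lef_lines.items():
    print(f"{node:>8}: ~{lines} lines ({growth[node]:.1f}x the 28nm file)")
```

By this rough measure, the routing-rule description at 7nm is more than three times the size of its 28nm counterpart.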

To offer another example, consider the implementation of an Arm Cortex-A76 CPU, our company’s newly released processor. The Cortex-A76 takes about 8-10 hours to synthesize with four CPUs, and any test of a change takes at least that long to show results. Place and route can take days. One evaluation we’ve seen has just the placement-optimization stage taking more than 30 hours, and the clock-optimization stage alone taking another 30 hours. All of this, of course, depends on the design, its constraints and the number of CPUs being used for the implementation run.
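Adding up just the stages quoted above gives a sense of what one implementation iteration costs. The sketch below is back-of-the-envelope arithmetic only: it treats "more than 30 hours" as exactly 30, and it leaves out routing, sign-off and every other stage, so the result is a floor rather than an estimate.

```python
# Stage runtimes quoted in the text, in hours.
synthesis_hours = (8, 10)   # quoted range with four CPUs
placement_opt_hours = 30    # "more than 30 hours": used here as a lower bound
clock_opt_hours = 30        # "another 30 hours"

low = synthesis_hours[0] + placement_opt_hours + clock_opt_hours
high = synthesis_hours[1] + placement_opt_hours + clock_opt_hours

print(f"one iteration: at least {low}-{high} hours "
      f"(~{low / 24:.1f}-{high / 24:.1f} days), excluding routing and sign-off")
```

Even with those omissions, a single pass through these three stages takes close to three days, which is why each late surprise translates directly into schedule risk.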

Time-to-market pressures
All this complexity means time-to-market windows for companies can be much more challenging to hit. For design teams working on those mind-blowing designs at 7nm, this is unacceptable.

Arm’s answer to these vexing issues has been its successful 8-year-old POP IP program. Arm POP IP is the bridge between the Cortex-A CPU and silicon process technology. An optimized Artisan physical IP solution for the Cortex-A processor series, it offers a proven high-performance or high-density implementation solution, within a given power envelope, that helps lower technical and project risk.

POP IP enables multiple configurations, ranging from low power to a combination of high performance and low power that uses light sleep mode and high-performance fast cache instances. Tightly coupled processor core and POP IP product development gives design teams the flexibility to optimize for maximum performance, lowest power, or any combination in between.

POP IP is always optimized for a particular process node and market segment (premium mobile in the case of the Cortex-A76 POP IP). So, each POP IP delivers unique benefits. The microarchitecture of Cortex-A76 and the DynamIQ technology are new to many design teams, and the advanced process nodes present new implementation challenges. POP IP helps ensure that the increased productivity of the new CPUs is echoed in the productivity of your implementation teams.

Arm POP IP already enables laptop-class performance in 7nm SoCs based on Cortex-A76: it gives designers the opportunity to push the clock speed past 3 GHz and up to 3.3 GHz within power envelopes around half that of today’s mass-market x86 processors. POP IP adds value in four key ways:

  1. Accelerates time to market
  2. Lowers the risk of implementation
  3. Provides predictable PPA
  4. Offers flexibility for the processor configuration, physical IP and implementation methodology

POP IP solves implementation challenges at advanced geometries before design teams experience them. Earlier, I alluded to the increasingly strict design rules that come with each new process node. With 7nm, there are additional placement rules, a new style for VIA connection – VIA ladders – and the idea of explicit coloring requiring updates to the physical IP and the design tools. EUV processes will have their own challenges that impact the physical IP and the tools, and POP IP continues to provide solutions.

Overcoming implementation challenges
One unique feature of POP IP is completely behind the scenes. Internal to Arm, we are able to begin early processor trials with the physical IP and the POP IP methodology concurrently for new cores. This begins an iterative flow that also benefits from the collaboration between Arm, EDA vendors, and the foundry process teams. This portion of the ecosystem and our joint efforts result in refinement of the core, the Artisan physical IP, the SoC implementation methodology, and the process itself.

The result of development and implementation learning internal to Arm is delivered to design teams as part of the POP IP contents: in the user guide and reference scripts supporting the final implementation.

When Mead offered his process-evolution forecast a quarter century ago, consumers were buying cordless and very early mobile phones, digital cameras, laptops, pagers and so forth. Since then the march of IP-based design has continued unabated, driving through 180nm to 28, 16/14 and now 7nm. In part, that has helped us integrate all those discrete devices into one: the smartphone. At the same time, the relentless march of innovation and design enablement has created markets like the Internet of Things, autonomous vehicles and AI/ML-based applications that are just now coming on the scene.

POP IP helps address the implementation challenges for new cores, new processes and new implementation methodologies. It can translate the substantial gains in CPU compute onto silicon and make designs more optimized and efficient in terms of compute cycles, while reducing the time-to-market risk your design team would face if it took on the SoC all alone.
