Low Power-High Performance

The Growing Integration Challenge

As the amount of IP and other functionality increases, so does the need to reduce margin in designs. This is easier said than done.

November 8th, 2012 - By: Ed Sperling

As the number of processors and the amount of memory and IP on a chip continue to skyrocket, so does the challenge of integrating it all on a single die, or even on multiple dies in the same package.

There are a number of reasons why it’s getting more difficult to make all of these IP blocks work together. First of all, nothing ever stands still in design. As a result, what works once doesn’t necessarily work the next time. There are too many components to effectively control in an SoC ecosystem, and updating what essentially are black boxes can have unexpected consequences even with the best-characterized IP.

The rule of thumb in the IP world is that if you’ve used it before, it should work. But that’s not necessarily true. Nothing is static, which means even derivative designs using the same process technology have to be checked out thoroughly. And tweaking one thing may have an effect on something else.

“There’s a common theme running through design these days that the reliability of the device is becoming a bigger and bigger problem,” said Aveek Sarkar, vice president of product engineering and support at Apache Design. “If you lower the supply voltage, noise increases. But if you lower noise, then that creates more errors, too.”

One of the most common approaches to solving these problems has been to put extra logic on the chip. But that extra margin carries a steep price in terms of power and performance, particularly at advanced process nodes and in densely packed designs. Foundries have been pushing designers to reduce the margin so chips can meet power and performance budgets, using finFET transistors to control leakage at 14nm, higher-k gate oxides, and fully depleted silicon on insulator at 20nm and beyond.

“We have to control margin in the design process,” said Richard Trihy, director of design enablement at GlobalFoundries. “As the nodes advance, that’s becoming a harder and harder problem—how to claw back the margin from designers when they are trying to meet frequency.”

Different approaches
Lowering margin, along with many other low-power approaches, requires some architectural and methodology changes. Nowhere is that more evident than in the integration side of things. While more IP makes it quicker to assemble a design, that design has to be able to deal with different versions of IP blocks as well as IP from multiple vendors that may not be characterized using the same language.

The trick is to be able to create accurate enough models so that guard-banding isn’t necessary—or at least to use less guard-banding. But designers generally look for the outer limits that can affect a device’s reliability and work backward from there.

“What we’re seeing is a lot of worst-case power numbers being proposed for SoCs,” said Ghislain Kaiser, CEO of Docea Power. “A lot of this is based on unrealistic use cases. You don’t need to plan for every possible case. What’s required is dynamic analysis to simulate realistic use cases, so you move from more margin to less margin.”
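The gap Kaiser describes between a worst-case number and a realistic use case can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the block names, the peak power figures, the use cases and their activity percentages are hypothetical and do not come from Docea Power or any real SoC.

```python
# Hypothetical per-block peak power in milliwatts (illustrative values only).
PEAK_MW = {"cpu": 900, "gpu": 1200, "video_decoder": 600, "modem": 700}

# Hypothetical use cases: percentage of its peak that each block draws
# while the use case is running.
USE_CASES = {
    "video_playback": {"cpu": 20, "gpu": 10, "video_decoder": 100, "modem": 30},
    "gaming": {"cpu": 80, "gpu": 100, "video_decoder": 0, "modem": 20},
    "voice_call": {"cpu": 10, "gpu": 0, "video_decoder": 0, "modem": 100},
}


def worst_case_mw(peaks):
    """Static bound: assume every block runs at peak at the same time."""
    return sum(peaks.values())


def use_case_mw(peaks, activity_pct):
    """Power for one use case, scaling each block's peak by its activity."""
    return sum(peaks[block] * activity_pct.get(block, 0) for block in peaks) / 100


def realistic_peak_mw(peaks, use_cases):
    """Highest power across the use cases that are actually expected."""
    return max(use_case_mw(peaks, pct) for pct in use_cases.values())


if __name__ == "__main__":
    print("worst case (mW):", worst_case_mw(PEAK_MW))
    for name, pct in USE_CASES.items():
        print(name, "(mW):", use_case_mw(PEAK_MW, pct))
    print("realistic peak (mW):", realistic_peak_mw(PEAK_MW, USE_CASES))
```

With these made-up numbers the all-blocks-at-peak bound is 3,400 mW, while the most demanding use case (gaming) reaches 2,060 mW. The difference is margin that would be designed in but never exercised. A real dynamic analysis would work on time-varying traces rather than fixed percentages.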

The solution involves different ways of looking at a design. Rather than building everything from the bottom up, with very detailed measurements about power, heat and electromigration, there needs to be equal attention paid to how it looks from the top down. ESL models can help significantly there, but only if those models are constantly adjusted to deal with variations at the RTL level.

This is particularly important because partitioning needs to be strategic to accommodate future derivatives without having to worry about a whole new round of worst-case scenarios. In stacked die, for instance, IP may be an entire die. It’s entirely unknown what that die will be situated next to when it’s created. Adding margin is one way of dealing with this issue, but if that slows down the performance or increases the power to unacceptable limits, then other solutions need to be found.

“The basic challenge is to understand how the integration challenge is changing,” said Chris Rowen, CTO at Tensilica. “The pin interfaces are probably the best understood part of the design. We’ve got hardware standards like buses to define the pins, and if they’re bus compatible then the pins mostly hook up. That’s not true with other parts of a design. So with models there are some very good standards like SystemC, but there also are many representations of SystemC. You can’t necessarily take two SystemC models and expect them to work together.”

Rowen noted that the problem is magnified as the number of facets of the design increases. “There is a huge cone of complexity as you go forward. You have certain expectations as subsystems come into play, such as whether it works and puts an acceptable load on the bus bandwidth.”

If that load is too high, it can impact latency in the overall design. But even if it all works properly, one subsystem may interact with other subsystems. “In a lot of cases there is a strong correlation between subsystems,” he said. “You don’t want to run a video decoder and a graphics subsystem at the same time because your battery would last 20 minutes. But there’s no way to prove that what’s being covered is the worst case scenario.”

Software
One of the best alternatives for dealing with that kind of complexity is to handle it in software. Software can be written to deal with a variety of very complex hardware-related issues, such as turning blocks on and off quickly and maintaining an acceptable level of dark silicon. But software itself also needs to be written with an eye toward power and energy efficiency rather than just performance. Unlike hardware, where turning down the clock frequency results in lower power, just running less code doesn’t necessarily achieve the same results.

The key is getting more done in fewer cycles and then shutting down quickly, or using less-powerful processor cores to run software in the background. This rationalized use of resources has been talked about for years, but so far it has been restricted to advanced smartphone designs, where hardware and software teams tend to collaborate more closely.
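The trade-off in the paragraph above comes down to energy, which is power multiplied by time, summed over the active and idle phases. The sketch below compares three hypothetical ways of running the same task in a 10-second window. The function name `energy_mj` and every power and timing figure are assumptions chosen for illustration, not measurements of any real core.

```python
def energy_mj(active_mw, active_s, idle_mw, window_s):
    """Energy in millijoules over a fixed window: active phase, then idle.

    mW multiplied by seconds gives mJ.
    """
    if active_s > window_s:
        raise ValueError("task does not finish inside the window")
    return active_mw * active_s + idle_mw * (window_s - active_s)


WINDOW_S = 10  # hypothetical period in which the task must complete

# Hypothetical strategies for the same task (illustrative numbers only).
strategies = {
    # Finish fast on a big core, then drop into a deep sleep state.
    "race_to_idle": energy_mj(active_mw=1000, active_s=2, idle_mw=10, window_s=WINDOW_S),
    # Stretch the work across the whole window at a lower clock.
    "slow_and_steady": energy_mj(active_mw=300, active_s=10, idle_mw=10, window_s=WINDOW_S),
    # Run in the background on a smaller, less powerful core.
    "little_core": energy_mj(active_mw=150, active_s=8, idle_mw=5, window_s=WINDOW_S),
}

if __name__ == "__main__":
    for name, mj in sorted(strategies.items(), key=lambda kv: kv[1]):
        print(f"{name}: {mj} mJ")
```

Which strategy wins depends entirely on the assumed numbers. A higher idle power or a slower wake-up would shift the result, which is one reason hardware and software teams need to work from the same power data.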

“If your noise is high you can design for that—or have the software deal with it,” said Apache’s Sarkar. “And that’s a question that’s being asked a lot these days: Does the chip need to do it in hardware, or can the software do it?”

This problem is magnified in a multicore design, where adding additional cores doesn’t necessarily equate to a proportional increase in performance or a reduction in power.

“A single core may only give you 65% of the performance boost,” said Anil Khanna, senior product marketing manager at Mentor Graphics. “It’s typically a debugging problem, and debugging hasn’t evolved to deal with multiple cores. People are used to doing manual debugging for one core, and that doesn’t cut it anymore. With multicore we have functional and performance optimization challenges.”
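One common way to see why extra cores do not give a proportional speedup is Amdahl's law. The article does not name it, so treat it here as a standard textbook model rather than the method Khanna is describing. The parallel fractions used below are hypothetical.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup when only `parallel_fraction` of the work scales with cores.

    speedup = 1 / ((1 - p) + p / n)
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workloads: half, 80% and 95% of the work can run in parallel.
    for p in (0.5, 0.8, 0.95):
        row = ", ".join(f"{n} cores: {amdahl_speedup(p, n):.2f}x" for n in (2, 4, 8))
        print(f"p={p}: {row}")
```

The model ignores synchronization, memory contention and power limits, all of which tend to push real results lower still.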

Understanding IP better
Part of the solution also is to understand IP better. If a design is 90% re-used IP, and as much as 70% of that is derived from other vendors, there needs to be some standard way of characterizing it. The IP-XACT standard is one such mechanism. So far its adoption has been limited, but work is under way by Accellera (which acquired SPIRIT in 2009) to make it more robust.

Separately, the major foundries have taken steps toward at least qualifying IP for manufacturability. GlobalFoundries now qualifies IP, and TSMC has just expanded its IP Alliance to include soft IP from companies such as ARM, Cadence, Sonics, Synopsys and Imagination Technologies. In addition, TSMC is working with Atrenta to help test the reliability of all the IP in its approved portfolio.

“That’s an interesting relationship between TSMC and Atrenta and its Spyglass tools,” said Mark Throndson, director of product marketing at MIPS. “From TSMC’s view, you need to put more effort into the quality of the IP ecosystem. We’ve always had rigorous testing, but this is the first time we’re seeing this from the foundries.”

The reason is that while the large, established IP vendors can devote resources to characterizing IP, the smaller IP vendors don’t have as many customer engagements and as much feedback about what can go wrong in an implementation.

“The question is how you pick IP and integrate it,” said Mike Gianfagna, vice president of marketing at Atrenta. “What you need to show is how good it is. There is no such thing as perfect IP, but if you know where the problems are then you’re way ahead. Every piece of IP has some problems.”

Along similar lines, the IEEE P1687 standard addresses test and debug of IP. Steve Pateras, product marketing director at Mentor, said the goal is to test and debug IP more efficiently while decreasing integration time, using a combination of the Instrument Connectivity Language (ICL) and the Procedural Description Language (PDL). The company’s first product to leverage this standard was introduced this week for P1687-compliant IP blocks.

Ed Sperling is the editor in chief of Semiconductor Engineering.
