The Upside Of Dark Silicon

Analysis: At 20nm there is so much real estate available on SoCs that just figuring out what to do with it is a problem; don’t expect it to last.

By Ed Sperling
For many years the real challenge in IC design was in shrinking the components and features on a piece of silicon without burning up the chip or destroying signal integrity.

Chipmakers have become quite adept at this over the past few decades. Too good, in fact. Now they are faced with a different kind of problem: what to do with all that extra silicon. Just as the long-distance carriers laid more fiber-optic cabling in the 1990s than could be fully utilized (the so-called dark fiber that was never lit up with data traffic), chipmakers are struggling to find a use for all the real estate that has become available at 28nm and beyond. At that process node, we have entered the realm of dark silicon. Discussions about power and performance continue, but area is often a secondary discussion.

In some respects this is a good problem to have. It makes the design process looser. Things can be added into a chip after the architectural stage, extra margin can be built in, and if there’s enough room on the design, problems can even be fixed late in the process.

“There’s a lot more flexibility,” said Kurt Shuler, marketing director at Arteris. “But it still costs money. Every time in the past we have said there is enough area and area is free, we’ve found out it’s not.”

Superchips
So what should be done with all this area, if it’s not just to close up design flaws? One idea that has been utilized by major chipmakers is the superchip concept. Building in all the possible future configurations that are known when a chip is created, and then making only certain pieces of the chip active, is one way of reducing the cost of derivative chips.
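The superchip idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the block names and the three derivative products are hypothetical, not taken from any real chip. It shows one die carrying every block, with each derivative enabling a subset and leaving the rest dark.

```python
# Hypothetical superchip: every block below is physically present on the die.
# Block names and product tiers are illustrative assumptions only.
SUPERCHIP_BLOCKS = frozenset({"cpu0", "cpu1", "gpu", "modem", "video_codec"})

# Each derivative product enables only a subset of the blocks on the same die.
DERIVATIVES = {
    "entry": frozenset({"cpu0", "video_codec"}),
    "mid": frozenset({"cpu0", "cpu1", "gpu"}),
    "flagship": SUPERCHIP_BLOCKS,
}


def dark_blocks(enabled):
    """Return the blocks that exist on the die but are left inactive."""
    unknown = enabled - SUPERCHIP_BLOCKS
    if unknown:
        # A derivative cannot enable hardware that was never built in.
        raise ValueError(f"blocks not on the die: {sorted(unknown)}")
    return SUPERCHIP_BLOCKS - enabled


if __name__ == "__main__":
    for name, enabled in DERIVATIVES.items():
        print(name, "dark:", sorted(dark_blocks(enabled)))
```

The trade-off is visible in the sketch: the entry product ships three blocks it never uses, which is the silicon and licensing cost the derivative savings have to outweigh.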

This is particularly attractive for the network on chip, memory, and IP companies. In the case of NoC technology vendors, creating a flexible network for these different devices is essential. In the case of memory and IP companies, purchases and licenses are dependent on what gets put on a chip and shipped rather than what actually gets used. Some chipmakers have built in complex power-saving techniques, for example, that have been completely ignored by customers. That’s a waste of NRE resources as well as silicon, and in a high-volume production chip it can cost a lot of money. But at the same time it may have little effect on whether the chip actually functions well.

“There have been a lot of advances in SoCs in areas like power management that never get turned on,” said Drew Wingard, CTO at Sonics. “People spend a lot of energy sometimes for no reason. There’s a lot of dark hardware.”

This is particularly true in multi-core and many-core designs because many applications cannot take advantage of multiple cores at once. While more code is being written to take advantage of multiple cores, that isn’t so easy with some applications. As a result, some cores remain unused, usually powered down or completely off.
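One standard way to see why extra cores often go unused is Amdahl’s law, which the sources quoted here do not cite but which is the usual textbook model for this effect. The sketch below assumes a workload in which a fixed fraction of the work can run in parallel; the function name and the sample fractions are illustrative.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Ideal speedup under Amdahl's law for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Illustrative fractions: how much each added core helps a given workload.
    for fraction in (0.25, 0.5, 0.9):
        speedups = [round(amdahl_speedup(fraction, n), 2) for n in (1, 2, 4, 8)]
        print(f"parallel fraction {fraction}: {speedups}")
```

When only half of an application can run in parallel, four cores give at most a 1.6x speedup under this model, so powering the extra cores down can be the sensible choice.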

Some companies are pairing specific applications and functions with specific cores, or virtualizing them to utilize whichever core is available. The example Intel has used for years is one core in the background running malware checks—something that’s particularly interesting now that Intel owns McAfee—so another application’s performance is unaffected.

Industry sources say manufacturers are developing devices that can operate independently for office or home use without risk of one infecting the other. Inside of corporations, keeping these worlds separate has become a security nightmare. By keeping them separate and not allowing data to move across a partition, it becomes harder for malware to infect a company without a physical way inside such as a USB device.

Physical effects buffers
One of the great advantages of excess silicon involves proximity effects. Power, noise and electromagnetic interference, among others, are exacerbated by feature shrinkage. In a tight space, this can create all sorts of havoc. A noisy SerDes or I/O can disrupt an analog signal.

“There’s also the notion of irregular behavior,” said Jack Harding, chairman and CEO of eSilicon. “We’re seeing thermal inversions where the chip will run faster when it’s hotter and slower when it’s cooler. That’s not intuitive. Physical effects were the theoretical limits, but we’re seeing non-linear behavior of digital CMOS.”

At 100 million gates for an SoC, or 1 billion gates for a digital processor, this is impossible for the human brain to comprehend. But simply because of that, less of the chip is actually being programmed. In some cases that means a grossly sub-optimized chip. But in other cases it may mean the difference between getting the chip to work with a reasonable manufacturing yield and one that doesn’t work at all.

The future
While the business side may frown, this has been an almost free ride for the design world. But if this sounds too good to last, it probably is. As stacking of die begins, probably starting late next year, the familiar challenges and proximity effects will surface once again. Just as nature abhors a vacuum, so do semiconductor economics. Free space should be used to maximize performance and minimize power, and with dedicated memories, Wide I/O, heterogeneous cores and greater cache coherency on the horizon, the age of dark silicon could evaporate rather quickly.

There will still be demand for more flexibility options, more memory (although more of it will likely be off-chip) and more performance at lower power. And there will be a renewed effort to consider more options further up in the architectural planning and modeling stage. For any engineers who have gotten comfortable with the availability of back-wiring and extra room for problem solving, this may have been a brief respite as the industry begins rushing toward a slew of new technologies, new problems, and a renewed focus on better utilization of every last nanometer of silicon.