Big Data Meets Chip Design

As the volume of data grows, companies are looking at what else they can do with that data.

The amount of data being handled in chip design is growing significantly at each new node, prompting chipmakers to begin using some of the same concepts, technologies and algorithms used in data centers at companies such as Google, Facebook and GE.

While the total data sizes in chip design are still relatively small compared with cloud operations—terabytes per year versus petabytes and exabytes—it’s too much to sort through using existing equipment and approaches.

“You can take many big data approaches to handle this, but there may be a business problem if you do,” said Leon Stok, vice president of EDA at IBM. He said EDA doesn’t have the kind of concentrated volume necessary to drive these kinds of techniques, and typically that problem is made worse because the data is often different between design and manufacturing.

But for those working on designs, the amount of data has grown significantly at a time when extracting key data in various parts of the design flow is crucial. For one thing, data from different domains such as electrical, thermal and timing historically have been siloed, said Tobias Bjerregaard, CEO of Teklatech. “One has been modeled and analyzed by making assumptions and margining of the others. Hence multi-physics and cross-domain effects have not been handled, or even understood, effectively.”

Second, he said, SoC data has been divided into manageable blocks in hierarchical design approaches. The amount of data in a state-of-the-art SoC is large enough that it no longer fits on a single machine. “The result is that ICs are being over-designed. Margin in terms of area, power and performance is being left on the table. By employing big data methods, it will be possible for designers to leverage an understanding of cross-block/cross-domain issues, and recoup the margin.”

Vic Kulkarni, senior vice president and general manager of the RTL Power Business Unit at Ansys, agrees. “The typical complexity we see may be a 20 billion transistor SoC with an RC network of 100 billion nodes with all the connected dots and connected nodes, so energy efficiency and power management and power integrity have become rather interesting in terms of how you look at the margins.”

Take the 10nm technology node, he said. “10nm will have 500 millivolts supply voltage, and the typical margin is 15%, so let’s say 75 millivolts. The designer knows they have to meet this margin for dynamic voltage drop or their power budget or power management number, so they get involved in managing the margin. That 75 millivolts gets distributed to the chip — maybe 10% there, 15% in the package, 25% somewhere else. They will allocate that margin, that particular small milli-voltage they have available, because everything will have impact. So the chip-package-system story becomes more important in terms of IoT because now one effect like power can impact thermal, can impact voltage-aware timing, and can impact performance of the whole chip, package and system. The package may degrade in its performance if you don’t do a proper decap optimization. You will get signal bouncing. You will also transfer the problem from the chip level, and to the package and the system, subsequently.”
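The margin arithmetic in Kulkarni’s example can be worked through in a few lines. The sketch below is illustrative only: the 500 mV supply and 15% margin come from the quote, while the function names and the reading of the 10%, 15% and 25% figures as shares of the 75 mV budget are assumptions made for the example.

```python
# Illustrative sketch of the margin arithmetic quoted above.
# Function names and the interpretation of the shares are assumptions.

SUPPLY_MV = 500.0        # supply voltage cited for 10nm
MARGIN_FRACTION = 0.15   # "typical margin is 15%"


def total_margin_mv(supply_mv, margin_fraction):
    """Return the voltage margin in millivolts."""
    return supply_mv * margin_fraction


def allocate_margin(margin_mv, shares):
    """Split the margin across consumers.

    `shares` maps a consumer name to its fraction of the margin.
    Returns the per-consumer allocation and whatever is left unallocated.
    """
    if sum(shares.values()) > 1.0 + 1e-9:
        raise ValueError("shares add up to more than the available margin")
    allocation = {name: margin_mv * frac for name, frac in shares.items()}
    return allocation, margin_mv - sum(allocation.values())


margin = total_margin_mv(SUPPLY_MV, MARGIN_FRACTION)
# Assumed reading: each percentage is a share of the 75 mV budget.
budget, unallocated = allocate_margin(
    margin, {"chip": 0.10, "package": 0.15, "elsewhere": 0.25}
)
for name, mv in budget.items():
    print(f"{name}: {mv:.2f} mV")
print(f"unallocated: {unallocated:.2f} mV of {margin:.2f} mV")
```

Under that reading, the three named shares consume only half of the budget, and each one amounts to less than 20 mV, which shows how little voltage each part of the chip-package-system has to work with.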

Kulkarni stressed this is a key aspect of the total IoT design effort, given that IoT sensor nodes are getting more and more complex. “There is more edge processing happening on the sensor nodes, so a lot of processing has to be done in any vertical segment, for that matter—industrial IoT 4.0 for industrial applications, to autonomous vehicles, to healthcare and to drones, to name a few. All of them require highly energy-efficient end devices to start with, so big data [concepts] will allow what-if scenarios, and allow the engineering team to move away from margin management towards meeting design specifications.”

This approach basically flips the design process. Rather than over-designing the chip to make sure it doesn’t exceed the power budget, the team starts with an underdesigned chip and adds only what is needed.

“If you are doing margins, people tend to do overdesign of the power grid, the die size increases, and then the PPA metric gets out of whack,” Kulkarni said. “By looking at it from a holistic point of view, you start with underdesign, then you do surgical insertion of metal 2 vias, and you can get between 5% and 10% die reduction.”

This isn’t a simple transfer of big data methodology, though. As Bjerregaard pointed out, big data methods generally are deployed for the analysis of static data. Think of Google Maps, for example. It does not include what-if analysis capabilities to determine how long it would take to drive from point A to point B if a new road were added from point C to point D, or if the road from point E to point F were upgraded to a freeway or, in the case of chip design, completely removed.
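The what-if question in the map analogy can be sketched with a small shortest-path example. This is a toy illustration, not a description of how any mapping service or EDA tool works: the road names, travel times and helper functions are all hypothetical, and Dijkstra’s algorithm is used simply because it is a standard way to compute the fastest route.

```python
import heapq
import math


def build_graph(roads):
    """Turn {(a, b): minutes} into a two-way adjacency map."""
    graph = {}
    for (a, b), minutes in roads.items():
        graph.setdefault(a, {})[b] = minutes
        graph.setdefault(b, {})[a] = minutes
    return graph


def travel_time(roads, src, dst):
    """Fastest travel time from src to dst using Dijkstra's algorithm."""
    graph = build_graph(roads)
    best = {src: 0.0}
    queue = [(0.0, src)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == dst:
            return cost
        if cost > best.get(node, math.inf):
            continue  # stale queue entry, a faster route was already found
        for nxt, minutes in graph.get(node, {}).items():
            new_cost = cost + minutes
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                heapq.heappush(queue, (new_cost, nxt))
    return math.inf  # no route exists


def what_if(roads, add=None, remove=()):
    """Return a modified copy of the road map; the original is untouched.

    Keys in `remove` must match the (a, b) order used in `roads`.
    """
    changed = dict(roads)
    for key in remove:
        changed.pop(key, None)
    changed.update(add or {})
    return changed


# Hypothetical map: travel times in minutes.
ROADS = {("A", "C"): 10, ("C", "E"): 20, ("E", "D"): 20, ("D", "B"): 10}

baseline = travel_time(ROADS, "A", "B")
new_road = travel_time(what_if(ROADS, add={("C", "D"): 15}), "A", "B")
closed = travel_time(what_if(ROADS, remove=[("E", "D")]), "A", "B")
print(baseline, new_road, closed)
```

Note that every scenario here reruns the whole search from scratch. That is tolerable on a four-road map, but it hints at why what-if analysis on design-scale data would need something more incremental.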

“This means that the currently known big data methods are not necessarily suited for what-if analysis, let alone automated optimization,” Bjerregaard said. “The exception is for heuristics-based optimization—which is assumed, based on the designer’s know-how, to lead to better design performance. In this context, big data analysis could greatly assist the designer to gain an understanding of the design, leading to better heuristics.”

Still, Kulkarni maintains that the concept of only putting in exactly what is needed is not possible if the design approach is “in the box,” which adds margin. “Out-of-the-box thinking is ‘meet design goals, not margin goals,’ and that is not possible unless you have big data, because what is possible now is to do many experiments, rapidly, in a big data elastic compute architecture where the various functions become apps. EMI becomes an app. Then, because the infrastructure exists, extraction becomes an app. You can have place and route information, so-called heterogeneous information, absorbed in this architecture. You can run many simulations and experiments to see what happens with this die size versus the other die size.”

And while engineering teams regularly over-design to minimize risk, doing so wastes a lot of area. Applied to millions of devices in a sensor network, which includes image processing, CPUs and RF, die size costs can become prohibitive.

Technical impacts
There are other times when understanding data is critical in a different way. Consider antennas, for example. Hot spots create EMI radiation in the chip and the surroundings.

“In RTL there are certain clock gating transitions which can create downstream EMI events,” said Kulkarni. “They already create voltage drop events. To do that, you need very large, big data on a server farm, which can analyze the impact of those rather quickly, and then perform what-if analytics, which is not possible in today’s use model. Metaphorically, we can look at the Audi e-bike, the KTM X-Bow sports car, and Google’s Tensor Processing Unit to see what is common in these.”

Connected to this, the hierarchy of data analytics is important to understand, he said. At the lowest level are the self-explanatory descriptive analytics that describe a condition/failure. Next are predictive analytics, which define the condition of a failure. If that hotspot continues on a power line or signal line, for example, it will fail. But creating reliability models helps to avoid this. At the top of the analytics stack are the prescriptive analytics, which get to the crux of the need for big data: using machine learning to tell how to prevent the failures in future designs. Machine learning can give you the ability to do thousands of experiments with correlations that are automatic with the experience.
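The three tiers can be contrasted in a minimal sketch. Everything in it is hypothetical: the node names, the readings, the failure limit and the recommended action are invented for illustration, and simple rules stand in for the machine learning models the article refers to.

```python
# Toy contrast of descriptive, predictive and prescriptive analytics.
# All names, values and thresholds are hypothetical.

LIMIT = 100.0  # assumed failure threshold, arbitrary units

# Assumed hotspot readings over three successive analysis runs.
READINGS = {
    "power_line_a": [80.0, 85.0, 90.0],
    "signal_line_b": [60.0, 60.0, 61.0],
}


def descriptive(readings):
    """What is happening now: latest value and whether it exceeds the limit."""
    return {name: (vals[-1], vals[-1] > LIMIT) for name, vals in readings.items()}


def predictive(readings, steps_ahead=3):
    """What will happen if the trend continues: naive linear extrapolation."""
    forecast = {}
    for name, vals in readings.items():
        slope = vals[-1] - vals[-2]
        forecast[name] = vals[-1] + slope * steps_ahead
    return forecast


def prescriptive(readings, steps_ahead=3):
    """What to do about it: recommend an action for nodes forecast to fail."""
    forecast = predictive(readings, steps_ahead)
    return {
        name: "reinforce before projected failure"
        for name, value in forecast.items()
        if value > LIMIT
    }


print(descriptive(READINGS))
print(predictive(READINGS))
print(prescriptive(READINGS))
```

In this toy data nothing has failed yet, so the descriptive tier reports no problem. Only the predictive tier sees the first line heading past the limit, and only the prescriptive tier turns that into an action.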

In the chip world, big data is still a relatively new concept, but that is changing.

Susan Peterson, group director for product marketing of verification IP and memory models at Cadence, said the company is looking into newer verification methodologies that could touch on big data. “One of the first ones we are looking at is in the memory domain where we started to think, ‘What we do is functional verification, but does it work according to the spec? What if it works according to the spec but it’s super slow?’ Or, ‘What if it works according to the spec but it’s a power hog?’ Then you haven’t really met your design goals.”

To this end, she said the company will disclose later this year some “very unique products, particularly in the memory domain, that will help an engineer not only figure out if something is functionally correct, but how to balance power and performance, particularly in memory, so that you are really looking at the whole picture, not just the functionality.”

She acknowledged this will leverage different engines, with advanced big data-type algorithms used to analyze the data coming out of those engines.

At the same time, Frank Schirrmeister, senior group director for product management in the System & Verification Group at Cadence, believes that leveraging big data in the design tool industry is interesting, and that there is certainly a need for it. But he adds there is still much work to be done. “The need comes from us creating much more data in technologies like emulation and so forth than a user can reasonably digest. We have so much data to deal with that big data concepts definitely kick in.”

One area of development could be in machine learning in the context of debug, he said. “People have so many runs of data, and so many things to compare, it’s not possible to look at all of those manually. On top of it, you have a situation in which the process is limited. People often know what they are looking for, but the problem becomes, ‘What are the things they didn’t know about?’ What we are looking for are techniques to help users identify the things they weren’t aware of, like, ‘Have you looked at this?’ Or, ‘Are you aware that all of this is happening at the same time?’ Those are the types of things where we are looking for intelligence to help users to assess what’s really going on, and to find things they were not otherwise able to find. Just being able to deal with all the data you find is in itself a big issue, because you can’t just store it all away and hope you find something. You need to be much smarter about it.”

Big data methods can bring a lot to the table. But fully exploiting that potential requires a layer on top of the big data structures that implements a boiled-down view of a design, which in turn enables incremental analysis and fast design updates, Teklatech’s Bjerregaard concluded.
