Domain-Specific Design Drives EDA Changes

Number of options grows, but so does uncertainty about whether these designs will work and how much they will ultimately cost.


The chip design ecosystem is beginning to pivot toward domain-specific architectures, setting off a scramble among tools vendors to simplify and optimize existing tools and methodologies.

The move reflects a sharp slowdown in Moore’s Law scaling, which for decades was the best approach for improving performance and reducing power. In its place, chipmakers (a group that now includes systems companies) are pushing for architectures that are optimized for markets such as hyperscale computing, automotive, mobile, communications, aerospace/defense, industrial, and medical applications.

“The most advanced companies are designing ICs in-house for their applications,” observed Joe Sawicki, executive vice president for IC EDA at Siemens EDA. “Much of that IC methodology is the same process, but what’s changing is that if you design the whole system, you have new options to tie IC, package, and board design together to create even greater differentiation and value.”

Much of this was triggered by a slowdown in device scaling, but it has since taken on a life of its own.

“There’s industry-wide awareness that Dennard scaling stopped a while ago and Moore’s Law has pretty much ground to an end,” said Rupert Baines, chief marketing officer at Codasip. “You can still get more transistors, just about, but we’re looking at the end of that. In terms of lower cost per transistor, we went past that a while ago. And yet there is a huge opportunity for architectural and algorithm optimization, with improved performance at the same node. There are lots of proof points for this. A very long time ago, we used to do DSP functions in general-purpose logic, and then in general instructions. Then we started getting specialized instructions in DSPs and co-processors and accelerators. Twenty years ago, the same thing happened with graphics, and there was a time when graphics was done in software. If you had a Sinclair Spectrum or Commodore PET or Apple 2 or anything like that, the graphics were all done in software. Now, everything is done in dedicated hardware. Nvidia and Arm and Imagination are doing incredibly sophisticated things in optimized and specialized processors, and we’re seeing that same idea applied to AI, just more generally.”

A case in point is Apple’s M1 chip. “They’ve achieved amazing things by taking fairly conventional hardware and optimizing it for the software they care about. That’s doubling performance, doubling battery life,” Baines said. “Apple shows what you can do by tuning architectures for applications, and there’s absolutely no reason that won’t become mainstream.”

Rethinking business models
Hyperscalers have added an interesting twist on all of this, funding hardware innovation with their software compute services. “This has never happened in the past,” said Priyank Shukla, senior staff product manager for Interface IP at Synopsys. “Architectures were defined by one processor company, Intel, which owned the whole ecosystem. They wanted a generic architecture they could provide to their customers. But now, with all the activity in the hyperscale data center arena, those companies can charge their customers in their compute facilities by the second. And they can design their own hardware. This means that for the first time, hardware gets innovation dollars from a service. That’s why they’re investing in new architectures wherever they are needed.”

New architectures that were not available 10 or 15 years ago are shaking things up.

“For example, general matrix multiplication, which does AI processing much better than x86 or any other architecture, allows new tradeoffs,” Shukla said. “If I have a workload to process, do I need just an x86 kind of architecture to process it? Today, I have many other silicon-proven architectures in my inventory. How do I split the system? Do I have a big farm with just matrix multiplication that the GPU does, or do I give a part of that farm to your phone or device and enable the inferencing on the edge? All these options are available today, and there are R&D dollars funding it. That’s why we are seeing more domain-specific, rather than one-size-fits-all implementations.”
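For readers unfamiliar with the operation Shukla refers to, general matrix multiplication (GEMM) computes alpha * (A x B) + beta * C. The sketch below is a plain reference version for illustration only; the function name `gemm` and the sample matrices are assumptions made for this example and do not represent any vendor's implementation.

```python
from typing import List

Matrix = List[List[float]]


def gemm(alpha: float, a: Matrix, b: Matrix, beta: float, c: Matrix) -> Matrix:
    """Return alpha * (a @ b) + beta * c using plain nested loops."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a) or len(c) != m or any(len(row) != n for row in c):
        raise ValueError("incompatible matrix shapes")
    out = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for j in range(n):
            # Each output element is an independent chain of multiply-accumulates,
            # so the work can be spread across many parallel units.
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out[i][j] += alpha * acc
    return out


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
print(gemm(1.0, a, b, 0.0, c))
```

The inner loops perform m * n * k multiply-accumulate steps with no dependencies between output elements, which is why dedicated matrix hardware can handle this workload far more efficiently than a general-purpose core.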


Fig. 1: Some options for domain-specific applications. Source: Synopsys

Specialized tools for specific applications?
To understand the current dynamics in play, visualize a matrix of vertical industries and horizontal technologies.

“The vertical industries are comprised of aerospace/defense, hyperscale, mobile, communications, automotive, consumer, industrial and health care, which drive requirements like safety, security, how much I need things like digital twins and digital engineering, and what I can do with the data and how I transmit it,” said Frank Schirrmeister, senior group director for Solutions & Ecosystem at Cadence. “Other technology requirements can include AI/ML enablement, low power/thermal, chiplet/3D-IC, multi-domain simulation, mixed signal, embedded software development, photonics, and RF microwave.”

Some of these technologies overlay all industries, while others are dominant in some sectors and not others. But this specialization also suggests that specialized development tools for vertical segments may be necessary, although exactly where and how is still an evolving issue.

“Some markets, like automotive, have their own very particular, very special requirements,” said Baines. “And if you want to service that market you do need to understand their vocabulary and their standards and the way they work. You’ve also got particular algorithms with DSPs or AI. Doing a good job in those worlds requires understanding the specific software libraries and tool chains. If you’re going to do AI, you need to support CUDA and OpenGL and things like that. We’re going to see that level of optimization applied, and that level of understanding into other different things.”

The challenge is that EDA tool suppliers need to enable customers to do those things, but in a lot of cases the customers are the real experts. In Apple’s case, for example, the company knew what it wanted, but it was looking for a more efficient way of doing it.

How much of this requires specialization in the tools, versus the ability to explore a variety of different architectural options, isn’t entirely clear. “One domain-specific architecture is die-to-die disaggregation,” said Shukla. “This is addressing the constant theme across networking, in 5G base stations, but then also in HPC. In the quest for more aggregated computing per square micrometer, disaggregation is occurring. In one package, there are multiple die, and the architecture could be divided into homogeneous die and heterogeneous die.”

Homogeneous dies typically are defined as being developed in the same process node, although sometimes definitions extend beyond that. However, whether homogeneous or heterogeneous, each can bring its own set of challenges when packaged with other die.

“Previously, we were designing one complex die, but now the systems consist of multiple dies in a package,” Shukla said. “You need to do the system partitioning. What do you want to have on this chip, this die? What do you want to have on that die? Where would they be placed? How would they be connected? Would you be able to close timing? What does the power delivery network look like for this whole package? Those are the new challenges for this kind of domain-specific architecture, which is homogeneous disaggregation. Another option is split SoCs or split dies that are connected for some functionality. This is not a mesh network. There are different challenges, and you have to take care of timing issues, because these dies are interconnected. While it may appear simpler, it nevertheless poses new design challenges.”
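The partitioning questions Shukla raises can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the block names, areas, and connectivity are hypothetical, and `evaluate_partition` is a helper invented for this example, not a function of any EDA tool. It scores a candidate assignment of blocks to dies by the area each die ends up with and by how many connections must cross a die boundary.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def evaluate_partition(
    block_area: Dict[str, float],
    nets: List[Tuple[str, str]],
    assignment: Dict[str, str],
) -> Tuple[Dict[str, float], int]:
    """Return (area per die, number of nets that cross a die boundary)."""
    area_per_die: Dict[str, float] = defaultdict(float)
    for block, area in block_area.items():
        area_per_die[assignment[block]] += area
    crossings = sum(1 for a, b in nets if assignment[a] != assignment[b])
    return dict(area_per_die), crossings


# Hypothetical blocks (areas in mm^2) and point-to-point connections.
block_area = {"cpu": 40.0, "gpu": 60.0, "serdes": 10.0, "hbm_ctrl": 15.0}
nets = [("cpu", "gpu"), ("cpu", "hbm_ctrl"), ("gpu", "hbm_ctrl"), ("cpu", "serdes")]

option_a = {"cpu": "die0", "serdes": "die0", "gpu": "die1", "hbm_ctrl": "die1"}
option_b = {"cpu": "die0", "hbm_ctrl": "die0", "gpu": "die1", "serdes": "die1"}

for name, option in (("A", option_a), ("B", option_b)):
    areas, crossings = evaluate_partition(block_area, nets, option)
    print(name, areas, "die-to-die nets:", crossings)
```

A real flow would also weigh timing closure, power delivery, and placement within the package, none of which this sketch attempts to model.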

New technologies, such as 5G, add other challenges. Because of rapid signal attenuation, particularly with millimeter-wave, many base stations are required. This translates into more antennas, more base stations, and new standardization so that new players can come in.

“Microsoft has invested a lot on 5G networking,” Shukla said. “They are coming into the place which was served by Nokia and Ericsson, and when these hyperscalers come in, they want to standardize so there are many new players. The standards previously with 4G RAN were proprietary interfaces. Now, hyperscalers are asking for new interoperable protocols such as Ethernet, even for 5G base stations. eCPRI is one that uses Ethernet, and this is leading new design starts in the 5G domain. 5G Open RAN also is leading design starts for base stations. Again, because you want to integrate as much as possible in the least area, we are seeing die disaggregation. The theme remains that you need to disaggregate die in domain-specific architectures used for both HPC as well as 5G.”

All these changes mean the EDA tools must also accommodate new design needs.

“In the past, EDA tools solved on a die,” Shukla said. “Now, EDA tools have to solve on a package because there are multiple die. Floor-planning we used to do for a die. Now we have to not only do floor-planning on a die, we have to do it for a package. Bump placement, auto routing, all these things that were done at die level need to be done at the package level, along with die-to-die routing, floor-planning, sign-off analysis for electromigration, power and network analysis.”

Optimization is king
With ever-more complicated chips, engineering teams are faced with more and more data — far more than the human brain can process and store. But all of that data is needed to effectively verify a chip will work as expected, and to debug it when a problem arises.

“There is no way to make conclusions that easily grasp all the information that is valuable,” said Olivera Stojanovic, project manager at Vtool. “We are trying to help users to filter all the necessary information and focus only on debug. Machine learning also helps when there is a huge amount of data, and you need to figure out if there are relationships within it, as well as when an error is detected, to help the engineering team understand where the issue is and to grasp the main focus for doing the debug. In verification, especially when using third party VIPs, it’s not your code. You can’t control it. There is a bunch of information and it is not easy to control. New approaches are making it possible to determine what is the valid information when there are different providers and different combinations of code in the verification environment. For example, if you have some code that is running on a CPU, that is C code. On the other side you have a UVM environment. There are both messages as an output of the execution of the code, along with the UVM messages. If the log files can be merged and looked at with the same tool, it would be very helpful for the verification engineers.”
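The merged-log idea Stojanovic describes can be sketched in a few lines. This is a minimal illustration, not Vtool's implementation: the line format (`<time_ns> <message>`), the source labels, and the sample messages are all hypothetical, and the sketch assumes both logs are already ordered by a shared simulation timestamp.

```python
import heapq
import re
from typing import Iterable, Iterator, List, NamedTuple


class LogEntry(NamedTuple):
    time_ns: int   # simulation time, assumed to be in ns for both sources
    source: str    # "UVM" or "CPU" (labels chosen for this sketch)
    message: str


# Hypothetical line format shared by both logs: "<time_ns> <message>"
_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")


def parse_lines(lines: Iterable[str], source: str) -> Iterator[LogEntry]:
    """Yield entries for lines that match the assumed format; skip the rest."""
    for line in lines:
        match = _LINE.match(line)
        if match:
            yield LogEntry(int(match.group(1)), source, match.group(2).strip())


def merge_logs(uvm_lines: Iterable[str], cpu_lines: Iterable[str]) -> List[LogEntry]:
    """Merge two logs, each already ordered by time, into one timeline."""
    return list(heapq.merge(
        parse_lines(uvm_lines, "UVM"),
        parse_lines(cpu_lines, "CPU"),
        key=lambda entry: entry.time_ns,
    ))


uvm_log = ["100 UVM_INFO driver: write addr=0x10", "300 UVM_ERROR scoreboard: data mismatch"]
cpu_log = ["150 firmware: dma_start()", "250 firmware: dma_done()"]

for entry in merge_logs(uvm_log, cpu_log):
    print(f"{entry.time_ns:>6} ns [{entry.source}] {entry.message}")
```

With the two streams interleaved on one timeline, the firmware activity that preceded the testbench error sits right next to it, instead of in a separate file.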

“The latest advances in debug center around a mix of techniques, such as hardware trace, scan chain-based debug, and so on,” said Shubhodeep Roy Choudhury, co-founder of Valtrix Systems. “Tracing and capturing the internal signals based on trigger events, and storing them in a trace buffer, which can then be read from a debug port, allows seamless collection of data without disrupting the normal execution of the system. A lot of hardware instrumentation may be required, but placing the trigger events near the point of failure can yield a lot of visibility into the issue.”
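A software model can illustrate how trigger-based tracing behaves. The sketch below is an assumption-laden toy, not a description of any real debug IP: the `TraceBuffer` class, its parameters, and the trigger condition are invented for this example. It keeps the most recent samples in a fixed-depth buffer, continues for a set number of cycles after the trigger fires, and then freezes so the contents can be read out.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


class TraceBuffer:
    """Software model of a trigger-based trace buffer (illustrative only)."""

    def __init__(self, depth: int, post_trigger: int,
                 trigger: Callable[[int], bool]) -> None:
        if not 0 <= post_trigger < depth:
            raise ValueError("post_trigger must be in [0, depth)")
        self._samples: Deque[int] = deque(maxlen=depth)  # oldest entries fall off
        self._trigger = trigger
        self._post_trigger = post_trigger
        self._remaining: Optional[int] = None  # None until the trigger fires

    @property
    def frozen(self) -> bool:
        return self._remaining == 0

    def sample(self, value: int) -> None:
        """Record one signal value per cycle until the capture window closes."""
        if self.frozen:
            return
        self._samples.append(value)
        if self._remaining is None:
            if self._trigger(value):
                self._remaining = self._post_trigger
        else:
            self._remaining -= 1

    def read_out(self) -> List[int]:
        """Model of reading the buffer contents through a debug port."""
        return list(self._samples)


# Hypothetical scenario: trigger when the observed signal equals 5.
buf = TraceBuffer(depth=4, post_trigger=1, trigger=lambda value: value == 5)
for cycle_value in range(10):
    buf.sample(cycle_value)
print(buf.read_out())
```

The buffer ends up holding the samples immediately around the trigger, which mirrors the point in the quote: the closer the trigger is to the failure, the more of the limited buffer depth is spent on relevant history.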

This is particularly important in domain-specific designs and in advanced packaging, where there is limited history of what works and what doesn’t, and where there are a number of unique variables.

“Since late bugs are always a risk to the schedule, in addition to debug techniques, a high level of emphasis also should be put on the stimulus and test generators,” Roy Choudhury said. “Enabling software-driven stimulus and real-world use cases early in the design lifecycle increases the chances of hitting the complex bugs. Also, application software is not developed with the mindset of finding design bugs, and is often complex to debug, so using stimulus generators that can exercise the system better and utilize the debug infrastructure present in the system is always a good idea.”

Alternative solutions
Being competitive in specific markets is all about optimizing a design for a particular application and use case, but it requires a deep understanding of the chip, the software, the package, and what needs to be prioritized for a particular market or use case.

“We are optimizing and optimizing,” said Aleksandar Mijatovic, formerly the digital design manager at Vtool. “Demands are rising. We want smaller and faster. But the price of manufacturing is going up. The good news is the amount of over-designing will start to shrink a bit. The price will not be worth it as much as it was some years ago. We are approaching the point where we are not going to do breakthroughs through technology. We know that technology in silicon is slowing down, so instead we are going to turn to optimizing architectures, and try to speed up chip production. With the size of chips we are seeing now, entropy is becoming huge. You would need to bring all relevant data to one place and somehow compare it, because at a certain point, you are starting to get lost in all outputs and all the debug data in the complete flow as the order of magnitude of the problem changes. It starts getting out of focus simply because your capacity to remember everything is dropping. Also, you do not have time to learn the entire system before the deadline.”

As a result, foundries have moved toward proven structures, suggesting architectures that are known to work. This limits the number of options, because alternatives may be too expensive even though they are technically possible.

“Foundries will tell you, ‘In this technology, this will give you this performance and this yield,’” Stojanovic said. “It’s your choice. You can choose something that is not optimal, but you’re going to pay the area, manufacturing, and the bad performance.”

These tradeoffs among performance, power, area, cost, and development time are not getting any easier to figure out because there are so many ways to do things.

What do designers want?
User communities know and understand the algorithms used for a particular market. The challenge is developing hardware that optimizes those algorithms for their specific applications.

“They’re looking for a tool that can help them automate that or make it more efficient,” said Codasip’s Baines. “They’re also looking for a tool that supports verification.”

Verification has always been a huge challenge, but it can be even more difficult, time-consuming, and expensive with unique designs.

“When you start talking about processors, with all the complexity they’ve got, and with the incredibly difficult use case of lots of arbitrary instructions, verification becomes a really huge challenge,” Baines said. “To make a design easier for people that is also robust and that they can trust as much as possible, the tools have to do the work for you. They should generate scorecards and testbenches. The tools should generate UVM. The tools have to be consistent and compatible with standard flows. It’s really saying they have to be some sort of a preprocessor that generates Verilog or SystemVerilog in a way that is consistent and compatible, because you don’t want to have to reinvent everything. You’re looking for acceleration. You’re looking for a front-end tool that makes everything else easy and consistent with everything else. That, in turn, means consistency and compatibility must be human-readable. There have been tools in the past that spit out something very elegant, but it’s sort of obfuscated and encrypted. That makes them useless.”

Synopsys’ Shukla contends that EDA companies now have to up their game, helping customers with everything from system planning to implementation and firmware/hardware/software co-development.

To drive technology requirements in specific markets, one approach is to leverage industry standards. “One of the things that happens with safety and security, for example, is each industry seems to have related but slightly different standards for things,” Schirrmeister said. “There are DO standards in aerospace/defense. For safety, there is ISO 26262. Then, overlaying standards [in the United States] is the National Highway Traffic Safety Administration (NHTSA), with different organizations in Europe. So all of this is regionally dependent. That means there’s really no simple answer for how to think about this.”

What it does do is encourage teams that hadn’t previously interacted to do so. This also is reflected in a greater focus by the EDA tool community on creating solutions and flows of tools that have an interconnected foundation.

Conclusion
At the end of the day, the big debate is how to lead with targeted solutions versus generalized products, particularly as these products become increasingly integrated, and what is the best way to develop those solutions.

“The products are key, but you can’t make up for a tool that doesn’t work quite well by simply saying, ‘Oh, it’s all nicely integrated into a bigger thing,’” Schirrmeister said. “If you have a weak link somewhere, a very thin column of sorts, the whole building will collapse. But it also fosters bringing in more of the ecosystem so that when it comes to things like security, it’s really an ecosystem thing. That’s why in a lot of these flows today, when it comes to something like safety and security, the full stack, you want to work with somebody like Green Hills or Tortuga Logic or Dover Microsystems. It’s the same with low power. You bring in the technology partners, and there it’s more about the semiconductor technology ecosystem. You bring those all together and it becomes an ecosystem problem to solve, and you work together to solve it.”


