Less Margin, More Respins, And New Markets

How physics is reshaping the leading edge of design.

Semiconductor Engineering sat down to discuss the impact of multi-physics and new market applications on chip design with John Lee, general manager and vice president of ANSYS’ Semiconductor Business Unit; Simon Burke, distinguished engineer at Xilinx; Duane Boning, professor of electrical engineering and computer science at MIT; and Thomas Harms, director EDA/IP Alliance at Infineon. What follows are excerpts of that conversation.

SE: At the most advanced nodes, there are a lot of new issues involving heat, dynamic power density, resistance of electrons in wires, leakage current out of memory, noise, and electrostatic effects. How far can we continue to push this before we require fundamental changes in architectures, design and manufacturing?


(L-R) Duane Boning, Simon Burke, John Lee, Thomas Harms

Lee: I’m fundamentally optimistic about our ability to continue to design these chips. If you look at the process roadmap from our foundry partners like TSMC and Samsung, there’s a lot they’re doing to push the advanced process nodes. Then we also have 3D-IC. Every time in the last 20 years when we looked ahead at challenges over the next 5 years, we’ve always been able to solve them. What is clear is that you need to have physics-based simulation. With these advancements we no longer can afford margin. You can’t look at thermal as something separate from power, or as something separate from timing or power integrity. These are all coupled together. Margins are evil, and our job is to provide the tools to remove margin and design at advanced process nodes and across multiple systems and multiple dies. We can’t look at these as separate. We have to do something better than before.

Boning: There are places where we hit some limits, but what has always been impressive is that we find other ways to innovate around them. That has meant increasing complexity and increasing coupling, but fortunately the tools and the methodologies keep improving to address those. I don’t think we’re looking at a paradigm change where everything gets simpler. We are looking at a change where physical simulation, as well as other kinds of tools and methodologies, are emerging around those challenges. And machine learning and AI are coming along just in time to help us deal with some of the challenges.

Burke: We have immovable problems, and then we go on to the next immovable problems. Those solutions are essential to designers and the signoff process. A big challenge for us is to complete the design process and tape out a chip on a schedule that meets market demand and is different from what we did before. Managing that complexity in the same time frame is a big problem.

Harms: At Infineon we are not developing at the leading edge. We are just moving to 22nm, but our products need to be secure and energy efficient. Automotive requirements make this a challenging environment. For us, it’s also about integration and the debugging of problems and how to fix them. There are different domains from an analytics point, and we need ways of quickly viewing and analyzing everything at the same time. That is independent of moving to the most advanced nodes.

SE: There is something fundamental that changed, though. In the past, we pretty much knew chips developed at the most advanced nodes would end up in smart phones or computers. Today, with AI in automotive and everywhere else, 5G, and the edge, these end applications are in transition, as well. How does that impact design?

Burke: Ten years ago we did programmable logic. Today we have Arm cores on there, along with DDR, hardened network interfaces, machine learning, a network on a chip, and there are also technologies added to our products because our customers demand them. That has expanded the use of our product, which gets us into these new spaces of 5G, automotive, machine learning and networking that we probably wouldn’t have been able to go after 10 years ago. Our products have changed to reflect that, but it hasn’t changed the underlying problems to get the chip out. It just means we have more problems to solve to get the same chip out.

Boning: This reminds me of conversations five years ago when it was all CPU chips and cell phones. It was all about, ‘What are the new products going to be? What will drive new technology?’ We wished for diversity and a lot of variety, and now we’re getting it. We have a great opportunity to rise to the challenge of creating multiple kinds of devices, with multiple kinds of physics, including photonics, as well as signoff on integration. It’s an exciting time to bring all of those threads together, but it also will be challenging.

Lee: The idea of general-purpose silicon certainly will continue. But we’re also seeing silicon for specific applications. With hyperscale companies and systems companies building their own chips for specific use cases, that all makes sense. And that ties into some of the megatrends we’re seeing, including autonomous driving, automotive in general, 5G communications, what’s happening in the data center cloud and AI. That is driving tremendous innovation.

Harms: We do a lot of design systems for the designer to develop a new product. What we see now are more advances in system-in-package and stacked dies. We need the combination of different domains and timing across the substrate between chips. Heat, reliability and cracking are issues. These are all things we are facing in the system now.

SE: Do we design the silicon for all use cases, or do we design the silicon for a limited number of use cases? And is this any different today than in the past?

Burke: I don’t think anything has fundamentally changed. But in FPGAs, we have to do specialized parts for different market segments. There isn’t one FPGA for everyone. So we do one for 5G, and another for automotive. That’s true in the ASIC world, as well, and I expect that fragmentation will continue. But because you’ve got multiple chips coming out in the market rather than just one, you’ve got multiple problems to solve.

SE: Isn’t this easier in the FPGA market, where you can reprogram the logic?

Burke: Yes, but there isn’t one FPGA anymore. There are 10 or 12, each with a specialized market.

Harms: If you look at power efficiency, you may actually add more use cases because you may have one, two or three areas switched off to save power. So rather than reducing the number of cases, we’re actually increasing them.

Boning: If you look at general-purpose versus specific-purpose chips, and look at the larger ecosystem, it’s a renaissance for what started out to be special-purpose chips or hardware. So signal processors turned into speech signal processors or vision processors, and now you’re starting to see those become more generalized. We’re seeing waves of innovation on very narrow use cases, and then we’re seeing commonalities being generalized. The same cycle is there, but there is now rapid change. This is happening now with AI chips, where they are going from very specialized designs to more generalized processors that can address a big family of problems.

SE: We’re also adding density in terms of stacking, and looking for patterns across multiple bits and pruning zeroes. What does that mean for design going forward? Is this more of the same, or a new way of looking at things?

Burke: One of the interesting changes we’re seeing today is from scale-up to scale-out computing. A lot of the approaches used today require single systems, single servers, often with as much as you can fit in. But there’s a limit to what you can get and there’s a limit to what you can buy. The way forward is scale-out, where you have elastic compute in the cloud, and you add more servers if you want more capacity. That requires us to continue analyzing the size of the systems we’re building today. It’s an industry problem. But that transition is very important, and there is a fundamental change in the way EDA tools allow you to approach that problem.

Lee: The problem we have in EDA is that if you go back 30 years, the top computer scientists were focused in our area. Today, a lot of us are locked into 1980s computer science, and that affects how our tools run — or don’t run — in a compute environment. We’ve been looking aggressively at the best computational science outside of the EDA simulation industry. There’s so much innovation happening on the computer science side for scaling out computing, and all of that is essential. But as we go to more customized silicon, human talent is the limiting factor. We see the system companies struggling with how they staff internally. We’ve seen a lot of opportunities for semiconductor companies adding ASIC services. But there is a limit. If you have one senior engineer, how do you take AI and ML and apply them to our tools so that a junior engineer can become effective? There’s tremendous low-hanging fruit there, and we should be able to apply tools to harness AI/ML so that a human can look at it and say, ‘That’s a problem,’ and fix it.

Harms: You need to compute, but overall you have to make sure you have simpler compute architectures. This will be combined into multiple chips. We definitely need to scale out, and we need EDA methodologies and flows to handle that kind of scale out.

SE: There seems to be a lot more application-specific knowledge going into designs than in the past. In the past, you would push a chip to the smallest node possible. Now there are more respins based upon knowledge in a particular design.

Harms: Sometimes we are planning the second step already, so you get 90% of the fundamental functionality in because you want to try it out in real silicon. And you also do simulation up front. The spec from the customer always changes. So simulation continues, but in smaller increments.

Boning: I’ve worked on design for manufacturing for years, mostly dealing with manufacturing variation. What is very clear is that variability is coming in operating conditions, in the interaction with the environment, with the use case, and so on. So getting a chip right that’s going to answer all of those questions on a first spin is becoming harder and harder. You still want to try to do that, using as much of the current methodologies as you can, but there also is an emphasis — maybe taken from the software industry — on planning more spins with smaller bites. Rapid learning on all of these interactions is crucial. You need to gain that experience and feed it back into the physical models, augmented models, and capture some of the subtle effects that you gather from data in a quick spin that can go into a machine learning model. There are subtle interactions from the physics that you didn’t even know were there. We need to move toward faster iterations, if possible, and not try to solve everything at once.

Lee: We do see more respins, and we see a lot of challenges in time to ramp and time to volume. In the areas that we’re most sensitized to, we find you can’t fix what you can’t measure. What we’re seeing is that the margin-based methodologies being used have an impact on cost. And what we’re seeing at advanced technologies is those margin-based methodologies are conservative. Voltage and timing is one where there is a lot of awareness. If you look at PVT (process variability, voltage variability and temperature), these sources can be modeled much better. Voltage timing failures are real, and there is an opportunity there. The second source of models that can’t be fixed is on-chip electromagnetics. Now we have on-chip SerDes, high-bandwidth digital signals, and the on-chip inductance effects are extremely destructive. It’s a huge challenge for designs these days. Respins are increasing, and from our perspective it’s due to multi-physics and the need to model it better.


