Teklatech’s CEO talks about the challenges of scaling and how to minimize IR drop and timing issues.
SE: How much further can device scaling go?
Bjerregaard: The way you should look at this is that Moore's Law provides some valuable benefits in terms of performance, power and cost, but people tend to forget the underlying physical challenges that need to be solved to exploit those benefits. On one side are the manufacturing guys, with all their innovations in materials physics, making the transistors possible, and more and more of them. On the other side are the system specification folks, who design the system and demand that we exploit the benefits the manufacturing side delivers. Caught right in the middle of that is the back end, which is tasked with actually making it happen.
SE: There are rules to improve yield on the manufacturing side, but they are getting more restrictive, right?
Bjerregaard: Design rules are where the manufacturing folks say, 'Look, we've built this new 7nm transistor, for example, and it has all these wires, but in order to make it work, you have to do many things.' We do that, but there are still challenges in terms of getting the area down, getting area utilization up, and solving timing and power integrity issues. This all has to do with the challenge of making that huge logical system work in the physical world. We see power integrity becoming a central pivot point for getting benefits in all areas. It used to be that power integrity was just a checkmark.
SE: That was above 40nm, right?
Bjerregaard: Yes. But the problem is that power integrity now affects your PPA directly. In the past, if you stayed within a given level of voltage drop, the libraries were characterized for it, and therefore your timing and power would be okay. But now, with scaling technologies and the scaling of other parameters like supply voltage, power integrity becomes a much bigger part of the full picture. As we scale down to a 500 millivolt supply voltage, or below in some designs, having a 10% margin for IR drop is not a lot. On top of that, the same 10% has a higher impact on performance and total power. What we are seeing is that power integrity is becoming a strategic area where designers can leverage benefits to a much higher degree than they could before.
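The margin arithmetic here is simple but worth making concrete. A minimal sketch (the supply voltages are illustrative, not figures from the interview) of how a fixed 10% IR-drop budget shrinks in absolute millivolts as the supply scales down:

```python
# Illustrative sketch: a fixed percentage IR-drop budget corresponds to
# fewer and fewer absolute millivolts as the supply voltage scales down.
def ir_drop_budget_mv(vdd_mv: float, margin_pct: float = 10.0) -> float:
    """Absolute IR-drop budget in mV for a given supply and percentage margin."""
    return vdd_mv * margin_pct / 100.0

for vdd in (1000, 700, 500):
    print(f"Vdd = {vdd} mV -> 10% IR-drop budget = {ir_drop_budget_mv(vdd):.0f} mV")
```

At 1 V the same 10% margin is 100 mV of room; at 500 mV it is only 50 mV, while each millivolt of drop now costs proportionally more delay.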
SE: We’ve got power that goes off into multiple power domains and you’re trying to split it up, which is an offshoot of dark silicon. We’ve got dynamic power, leakage, thermal issues, and we’re looking at horizontal nanowire gate-all-around type FETs and vertical nanowires after that. How does all of this fit together?
Bjerregaard: Power integrity issues are not going away. They are becoming worse. You have faster switching times, a higher portion of dynamic power, decaps that are less effective, and so on. Even achieving your goals in terms of maximum IR drop is a problem, and it becomes a larger and larger one. Vertical gate transistors, for example, just continue the trend toward faster switching times. What you saw with finFET versus non-finFET was that the impact on delay of reducing the supply was smoother. But the problem is that when you hit the wall, you hit it hard, and the device stops working. That trend continues. When you have these very fast switching times in combination with not very effective decaps, you get quite a high degree of local dynamic IR drop, which puts local devices out of commission. They stop working. The whole concept of achieving your targets for dynamic voltage drop is in itself a problem. At the same time, these transistors become more sensitive to voltage variation, which affects the parameters that really matter, not only for getting the performance you need to compete in the market, but also for getting to profitability. The semiconductor business model is basically broken. It broke at 28nm.
SE: And it isn’t getting any easier.
Bjerregaard: No, exactly. One reason the cost per transistor is not going down is that yield is bad. Another is that area utilization is bad. If you're targeting 75% to 80% utilization, the decreasing cost per transistor assumes you keep getting that utilization. If you're suddenly getting 65% to 70%, you're just leaving money on the table. The wafer cost is the same whether you manage to cram in a billion transistors or 1.1 billion transistors. It's a question of the business model in the semiconductor industry, with its extremely high levels of competition, especially in the mobile space. Profitability is under serious threat.
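The utilization point can be shown with back-of-the-envelope arithmetic. In this sketch the wafer cost and transistor counts are hypothetical, chosen only to illustrate the mechanism: with a fixed wafer cost, every point of utilization lost raises the effective cost per transistor.

```python
# Hypothetical numbers: with a fixed wafer cost, lower area utilization
# means fewer transistors actually placed, so a higher cost per transistor.
def cost_per_billion_transistors(wafer_cost: float,
                                 max_transistors_b: float,
                                 utilization: float) -> float:
    """Cost per billion transistors placed at a given area utilization."""
    placed_b = max_transistors_b * utilization
    return wafer_cost / placed_b

WAFER_COST = 10_000.0   # assumed wafer cost in dollars (illustrative)
MAX_T_B = 1.5           # assumed transistors, in billions, at 100% utilization

for util in (0.80, 0.65):
    c = cost_per_billion_transistors(WAFER_COST, MAX_T_B, util)
    print(f"utilization {util:.0%}: ${c:,.0f} per billion transistors")
```

Under these assumed numbers, dropping from 80% to 65% utilization raises the cost per transistor by roughly a fifth, with no change in wafer cost at all.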
SE: What’s the solution?
Bjerregaard: Designers are realizing they need to start working more intelligently with design optimization. The benefits of scaling don't come for free anymore. You can't just go to the next node, do what you used to do, and expect the place-and-route tools to do enough. You need to work more design-specifically with optimization. When you optimize one parameter, like dynamic power integrity, you can reduce headroom to gain benefits in other areas. For example, how much metal do you really need to implement your power grid?
SE: That’s an interesting idea. It’s basically looking at it from the standpoint of what don’t you need in a design.
Bjerregaard: One of the reasons this is becoming so relevant is that semiconductor companies are under pressure. They're forced to think about it. It used to be a luxury to go along and deal with it as needed. It also made the industry somewhat lazy. You didn't have to figure out what you didn't need. You could work with the margin and headroom. But the truth is that most designs are overdesigned in some way. Most designs are designed around the critical parts, the critical timing paths. The power grid is designed for the parts that need the most power. There are lots of parts of the design that are not routing-critical or power-critical, and we're wasting a lot of resources there. From a power integrity perspective, it's about design being done homogeneously. Even today, the power grid typically is created as a homogeneous mesh of wires. Of course it has to be designed for the part that uses the most power and has the critical issue. But we're leaving a lot of stuff on the table. From a power grid perspective, we're leaving routing resources on the table. That means we're leaving area on the table, because most designs at advanced nodes are routability-constrained. But we're also leaving timing on the table, because if you have routability issues, you're going to get timing issues, because you have to route around everything.
SE: One criticism that we hear is that tools and methodologies focus on worst-case scenarios rather than the most common use cases.
Bjerregaard: That's true, but there's a reason the design methodologies are the way they are. Without that simplification, we couldn't even have gotten to the point of creating chips with 2 billion to 4 billion transistors. However, it may be time to make a change.
SE: The flipside of that is that a lot of the processes we're dealing with at advanced nodes are not fully baked. We're dealing with version 0.9 at best. A lot of times even version 1.0 is not the real version 1, and sometimes IP development starts at version 0.1. As we start developing designs on very immature process technologies, people are saying they have to build margin in just to make sure there's resiliency and reliability in the parts, and yield is going to depend on that. How do we solve that?
Bjerregaard: Poor libraries are a huge problem, because they force you to design margin into your design. The problem is also that if you start with a poor library you don't trust, you work that margin into the design and can't get rid of it later when you have more mature libraries. For instance, you've determined the footprint of your chip and your floorplan based on an early version of the libraries. Even if you get better libraries later, you've already made your decision on what the footprint is going to be. We need suites of tools that are flexible enough to make more changes by automatically adapting to the changing environment, both from a chip and technology modeling perspective, and also in terms of supporting a more heterogeneous design flow, where you can more fluidly solidify your chip and harden your blocks.
SE: So if you’re at 7nm, you’re looking more at a heterogeneous SoC for a mobile device?
Bjerregaard: All ASICs are heterogeneous if you look closely. It's the detail level I'm talking about, for instance the power grid. Do you really need this many power straps everywhere in the chip? Can you reduce it here and beef it up a little over there? What do you need to exploit the potential benefit of doing heterogeneous design? It means that if you optimize, for instance, for reducing your dynamic voltage drop, instead of just going for a fixed margin, you'd be able to trade that off for routing resources, for better timing, and even for better power. It does require that the models we use are accurate enough. However, you can't blame everything on the models. If you design it so it's good by construction, the underlying models matter less. The design will be better than if you hadn't optimized it. It's about qualitative, not quantitative, optimization.
SE: New materials are coming in, like cobalt replacing copper on the interconnects. What happens with that, and if we start moving to a fan-out or 2.5D where you may still have a 7nm processor in there but you’ll have other parts that go with it. How does that affect what you are doing?
Bjerregaard: If you work at a high enough abstraction level, it doesn’t really matter how your designs are built. If you work to optimize the quality of the design, you’re going to get a better design. Of course, the precise implementation of the design determines whether you get a huge benefit or a small benefit. But there are things we can do, that most designers do not do today, to improve the quality of designs where you get something for nothing. There’s a lot of headroom left on the table with the simplified design assumptions that we use today in the normal standard place and route CMOS designs. This is going to be even more so in heterogeneous systems. Every part of the design will have different needs. IR drop is not the problem. The effects of IR drop are the problem.
SE: It's a whole different way of looking at it.
Bjerregaard: Yes, and it’s okay to have a higher IR drop in this part of the design if it’s not timing critical, or noise critical if it’s a mixed-signal system. It’s not okay where you have a critical part that is next to an analog block. There are different ways to look at it. If you look at it from the customer perspective, nobody asks ‘what’s the IR drop on the chip?’ They ask ‘What’s the noise spillover to my analog part?’ or ‘What’s the total power consumption?’ Taking it from that perspective and looking at the effects of these physical-level issues is extremely important. It requires that we merge the different silos of analysis with IR-drop-aware timing analysis and digital-aware noise analysis in analog circuits.
SE: How high of a level of abstraction can you go with this?
Bjerregaard: It depends on what you’re fixing. If you’re looking at timing, it becomes a plumbing problem because you have to look at that precise part where the timing is not holding and what the timing is like there. But it’s still a system level perspective because you’re finding out what part of the system is the bottleneck. That again turns back to the fact that systems are heterogeneous, but we tend to margin according to the worst case. By looking at system-level constraints or requirements, we can go down and find the bottleneck and then fix it at the detailed level. That’s the traditional approach. But what do we not need in the other parts of the design? Focus has always been on that critical bottleneck to make it work.
SE: Because you’re trying to wipe out some of the silos, does it fit in at the architecture stage? Does it fit in at place-and-route?
Bjerregaard: In the end, solutions for handling heterogeneous systems are implemented at the physical level. The mission of physical design is to relieve the front-end designers of all these problems. We can do a lot at the architectural level, with clock gating, power gating, and so on.
SE: You can model some of it too?
Bjerregaard: Yes, but models at a high abstraction level are by definition less accurate. We shouldn't lull ourselves into the feeling that we can model everything and solve it based on modeling. We should solve the problem by creating designs that are robust by construction. Then the modeling accuracy is not as important; it's more a matter of relative improvement. For example, if you have two power domains and you know they don't have to be on all the time, even if they're on at the same time 90% of the time, you're still gaining something. The question is when you stop gaining something. When is the overhead of implementing it too high to make up for the benefit? That's a question you can only answer by modeling. Other than that, most things can be optimized in ways where the accuracy of the underlying modeling is not that important. We know we're getting something for nothing.
SE: Do you find people are getting better at this because movement from node to node is slowing? We’re just getting into 7nm, but the transition to 5nm may be more like 4 to 6 years.
Bjerregaard: The reason it's slowing down is that it's just not profitable anymore. It used to be an automatic economic gain. If it's not, you start thinking about where you can get an economic benefit. You can get it, for example, through optimization and working more intelligently.
SE: You may stay at 7nm for 6 years to do that?
Bjerregaard: There's a steep learning curve for engineers and organizations to understand how they can get the most out of a technology. The increasing complexity, of both the design and the technology, makes that learning curve steeper. You can say it's an intangible asset being built inside organizations working with these advanced nodes, an asset that needs to be exploited. If you just skip to the next node, a lot of that know-how is lost.
SE: We've seen a lot of companies that don't skip a node, but they don't necessarily create production chips from it. They develop a test chip, and that's all they need, because now they understand what goes on in that node. Then they move to the next one and actually produce a production chip.
Bjerregaard: Skipping nodes has always been a way to extend the lifetime of the 'know-how' asset. But a lot of companies are underestimating the challenge of skipping a node at these levels, because the learnings from 10nm are helping companies deal with 7nm. The ones that skip directly to 7nm won't know what hit them.