Overcoming The Limits Of Scaling

Experts at the table, part 1: Complex tradeoffs involving performance, power, cost, packaging, security and reliability come into focus as new markets open for semiconductors and shrinking features becomes increasingly expensive.

Semiconductor Engineering sat down to discuss the increasing reliance on architectural choices for improvements in power, performance and area, with Sundari Mitra, CEO of NetSpeed Systems; Charlie Janac, chairman and CEO of Arteris; Simon Davidmann, CEO of Imperas; John Koeter, vice president of marketing for IP and prototyping at Synopsys; and Chris Rowen, a consultant at Cadence. What follows are excerpts of that conversation.

SE: As we move down to 10nm and 7nm, we can’t just rely on device scaling anymore for power/performance improvements. What do you see as the big issues going forward and how do we solve them?

Mitra: The heterogeneity in SoC designs will continue to grow. We started off having big compute engines that did one thing or another. With SoCs, we are mixing and matching and putting a whole system on a chip. How do you analyze for heterogeneity? How do you design effectively for heterogeneity? Do you use a coherent or non-coherent system? Whether you’re looking at that for automotive or IoT, you’re looking at varying kinds of data and making real-time decisions very fast. At an architecture level, what is it that we have available to us to allow those kinds of designs to be done fast, to be verified and production ready? These are all questions that are being asked.

Janac: Architectural changes aren’t just driven by technology, though. They’re also driven by function and application. It used to be that the semiconductor industry built chips and people would program for them. Now, a lot of major designs are being driven by the evolution of software business models, so the hardware is being designed for software. And SoCs are being assembled out of IP, rather than being custom designed from scratch. So there may be several blocks that are custom for differentiation, but the rest is assembled from pre-made blocks. That’s one shift. Another big shift is that big OEMs are starting to design their own chips. Those are under different economic constraints than those developed by semiconductor companies. In the semiconductor industry cost is a major parameter. In the OEM world, you can absorb some of the costs of the silicon by the software business model revenue. Even the second-tier OEMs are looking at doing this.

Davidmann: One of the biggest challenges, whether it’s homogeneous or heterogeneous, is to ensure the architect will be able to deliver the software. All parts are defined by the software. The question is how you get models early enough in the design cycle to be able to explore how it’s going to work. So you can put a large processor in there, which is a standard part, but you may want it to be slightly different. And you have to get the software up and running almost before you’ve committed to it in silicon. That gives the architect the ability to still move it all around. You don’t want to load the software and find out you really needed eight cores but you only have six.

Koeter: From an architectural level, system-level performance is a big issue. Being able to model the processor through the interconnect and out to the memory interface, as well as deal with cache coherency issues, is critical to the overall performance. Also critical at an architectural level is power optimization and management. Those are two things that are really important. That traditionally has been done with Excel spreadsheets. More and more we’re seeing that being done with SystemC TLM models, with tools and infrastructure that all of us provide, along with being able to model the power and performance in the context of real traffic.

Rowen: The biggest challenge is the complexity of software. It’s getting harder, both in absolute terms and relative to the hardware. There is an opportunity for heterogeneous integration as you suck more and more into one chip, but there are limits even there in terms of how many things you can suck in. So people are trying to make the system more complete, which means faster and more capable. It may be increasing the number of CPUs or using different hardware engines. There are plenty of transistors to throw at a problem, and that will be true for years to come. We have much more of a problem in terms of how to get a useful experience out of those transistors. It’s easy to scale up if you’re just throwing more memory bandwidth at the problem. The growth of memory in most chips exceeds the growth of logic, in part because it’s harder to design logic. So you add more and more performance through memory. The challenge there is how you bring more and more memory on chip. On top of all of that, there are new approaches such as neural networks, deep learning, and other kinds of machine learning, that are changing the nature of the game and even what is software. You don’t program it. You train it. That’s a complete sea change not just in how you do it, but also in terms of who does it. That’s complemented by questions like, ‘How do you do submicron design in the presence of process churn? How far do you elevate verification tasks so you’re not just dealing with functional correctness?’

SE: What’s the critical feature here? Is it the best design of the processor? Is it time to market? Is it cost?

Mitra: Yes, to all of them.

Janac: But it’s not all of them at the same time. So you’re dealing with all of those criteria in a design space. I was talking with a semiconductor company, and they were saying that if they were approaching designs like an OEM they would be out of business. The OEM chip was about 30% bigger, but the OEM doesn’t care. It depends on the market segment. If you’re building a chip for a data center, you don’t care as much about area. If you’re building a low-cost consumer SoC, power is what matters.

Mitra: This is all the application-specific customization that is going on. With each application in a data center, whether it’s mobile or not, they tune it to optimize it.

Koeter: There has definitely been a change in terms of how fast you can make your processor run. That was the primary benchmark in the past, and with the server and cloud, it’s still about performance. But for the rest of the market it’s about cost, and that’s true for every segment. So how do you make a chip a little smaller and lower power and get it into a cheaper package?

Rowen: And get it to market faster.

Koeter: Yes, absolutely.

Rowen: This is an interesting dichotomy that’s emerging between the traditional semiconductor company and the system OEM, who’s now integrating the silicon and looking at the cost of the chip as part of the bill of materials. They’re less sensitive to the cost issues and more sensitive to the functionality. Getting the cost down has always been the biggest obsession among traditional semiconductor companies, but it plays out differently when you’re doing it internally and comparing it to the cost of a flat screen television. Would you save 50 cents on the die cost if you could instead get to market more reliably, or with a better feature or more battery life? The answer is no. If you go to a traditional semiconductor company, though, 50 cents is a lot of money.

Koeter: Back in the days of 1 micron, people were saying that silicon was free. Here we are at 16/14/10nm, and silicon is still not free. Design engineers squeeze every gate out of the chip. And they should. If you’re talking 16/14nm, and you can knock a square millimeter off your chip, you might be knocking, in rough numbers, 15 cents to 20 cents off the cost of your chip. That’s significant.

Janac: At 20nm, 1mm² is roughly 10 cents in high volume.
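
As a rough illustration of the arithmetic behind these figures, the sketch below multiplies the per-square-millimeter costs quoted above by a production volume. The 10-million-unit volume and the function name area_savings_usd are assumptions made for illustration; they do not come from the discussion.

```python
# Illustrative only: per-mm² costs are the rough figures quoted by the
# panelists; the production volume is a hypothetical assumption.
COST_PER_MM2_USD = {
    "16/14nm (low estimate)": 0.15,
    "16/14nm (high estimate)": 0.20,
    "20nm (high volume)": 0.10,
}


def area_savings_usd(cost_per_mm2: float, area_saved_mm2: float, units: int) -> float:
    """Total silicon cost saved by trimming die area across a production run."""
    return cost_per_mm2 * area_saved_mm2 * units


if __name__ == "__main__":
    units = 10_000_000  # hypothetical volume, not from the source
    for node, cost in COST_PER_MM2_USD.items():
        total = area_savings_usd(cost, 1.0, units)
        print(f"{node}: about ${total:,.0f} saved per 1 mm² removed")
```

Under these assumptions, trimming 1mm² at 10 cents per square millimeter across 10 million units comes to roughly $1 million, which is why the saving matters far more to a volume chip vendor than to an OEM amortizing silicon against system revenue.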

SE: We have some new factors that never entered into design in the past. One is security. The second is much greater reliability, which can mean 7 to 10 years in automotive and 15 years in industrial. How does that affect the architecture?

Janac: It depends. We have some customers who view silicon as disposable. If you try to sell security and resilience to those customers, you’re not going to get very far. On the other hand, if you have a mission-critical system like an automated driver-assistance car, those concerns are very important. Cost isn’t the main parameter for the auto guys, even though they always want the lowest price. At the moment, automotive is going faster than consumer, so it gets a lot of consumer. But it depends on whether you are on the mission-critical end of semiconductors or whether you’re on the disposable end, or somewhere in between.

Davidmann: Security isn’t just about the chip architecture. It’s the system architecture. People are designing things into the silicon so they can make the software put things in silos and compartments. They’re using much more than just the architecture of the silicon, though. They’re putting things into completely different places, so if you break into one area you can’t get into the other.

Koeter: At a chip level, you have to have a hardware root of trust that enables a trusted execution. You have to be able to uniquely identify that silicon, authenticate it and do key management. So do you have a security perimeter? A dedicated hardware root of trust? Or do you add an extension around your hardware that adds some security? We do that. ARM has something similar.

Davidmann: If you really want it to be secure, you have to design it from the bottom up. We have one customer that designs its own instruction set. That doesn’t go out the door, because they’re so worried about safeguarding how it works. Everything rests on that.

Mitra: On the security side, it is a system-level solution that requires a hardware root of trust. But it has to be software working in conjunction with hardware to make it a really secure system. The other thing is that when you look at the automotive market, security is a real-time decision. You need to make very quick decisions. In IoT or the cloud, decisions can be made slightly later. So the response time to security has to be almost instantaneous. You need to understand whether it’s a threat. To answer the other part of your question about reliability or resilience, at the architecture level what can be done is to add redundancy, making sure there is error correction, for instance. The other side is the back end. How do you implement that redundancy? Will that impact the architecture going forward? Perhaps, because you have design parts that are military grade from a circuit level, and you are wondering why no one took care of these issues at a much higher level. Some customers will pay for it because there is overhead to designing chips for 10 years.
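
To make the idea of redundancy through error correction concrete, here is a minimal sketch of a Hamming(7,4) code, which adds three parity bits to four data bits so that any single flipped bit can be located and corrected. The panel did not name a specific scheme; the choice of Hamming(7,4) and the function names are assumptions made purely for illustration.

```python
# Illustrative Hamming(7,4) single-error-correcting code.
# Codeword layout (1-based positions): p1 p2 d1 p4 d2 d3 d4


def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(code):
    """Correct up to one flipped bit; return (data bits, error position or 0)."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based index of the bad bit, 0 if clean
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome


if __name__ == "__main__":
    word = [1, 0, 1, 1]
    sent = hamming74_encode(word)
    received = list(sent)
    received[4] ^= 1  # simulate a single-bit upset at position 5
    recovered, position = hamming74_decode(received)
    print(recovered == word, position)
```

The overhead is visible in the layout: seven stored bits for every four bits of data, which is the kind of area and cost penalty that only some customers are willing to pay for long-lifetime parts.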
