Experts At The Table: SoC Verification

Last of three parts: Design variables for different markets; what’s good enough; uncertainty vs. innovation; big vs. small IP suppliers; future challenges with stacked die.


By Ed Sperling
System-Level Design sat down to discuss the challenges of verification with Frank Schirrmeister, group director for product marketing of the System Development Suite at Cadence; Charles Janac, chairman and CEO of Arteris; Venkat Iyer, CTO of Uniquify; and Adnan Hamid, CEO of Breker Verification Systems. What follows are excerpts of that discussion.

SLD: Is the amount of time necessary to achieve reasonable quality going up?
Iyer: There is no one size fits all. We’ve done devices that have been implanted into the human brain, and there is a huge reason to keep those chips simple. They can verify the heck out of it and be 99.998% confident. There are chips we know of in consumer products where there are constraints in the design and in the verification. In that market, time to market is more important than 20% of the functionality.
Schirrmeister: It all depends on the application. There are application domains where the time absolutely goes up. Where it is mission critical, there is no other choice. There are other application domains where your lack of confidence by the time you go to tapeout goes up. In the consumer domain you have to tape out on a certain shuttle time. Either you do more verification by executing more cycles, or you do smarter verification.
Janac: When you’re running an SoC project you can play with scope, resources or schedule. In markets such as satellite analysis, you can play with scope. In some telecommunications markets, you can play with schedule. And in some markets you’re playing with resources. In mobility, the schedules are decreasing but the amount of resources being thrown at the problem is getting much larger.

SLD: People and machines, right?
Janac: Yes.
Hamid: But there also are natural limits. There is something magical about 2,000 test cases. Every team seems to have that number, not 2,500 or 1,500. In some cases we are taping these things out without fully verifying them, and then we’re taking the hit in the software arena. The problem is functionality and performance over time because of all the bugs in the hardware. All of the big companies are looking at why it takes so long to get the software working. It’s because there are bugs in the chip and it takes so long to find them and work around them.
Janac: As one of our customers said, ‘Christmas waits for no man.’
Iyer: Neither does CES.
Janac: The schedules are fixed in some of these markets.
Iyer: So are we taping out and verifying less than we did in the past? The answer is yes, because we are trusting IP providers to verify more. Years ago, if you had an ARM chip you’d run many more vectors and simulations than now. With a 100-million gate chip, there’s no way you can run C code in a simulator.
Janac: ARM has been verified thousands, if not tens of thousands, of times in chips. And their verification process has improved over the years.
Iyer: Yes, ARM is one of the most mature chips we integrate. But ARM also is a hard negotiator and pricey in terms of royalties.

SLD: Does this mean we go to the lowest common denominator because it’s been verified so many times? And does that, in turn, mean that the enormity of the verification challenge will stifle innovation?
Janac: No, but people won’t necessarily go with IP startups. They’re going to go with innovation because they have to. A 20nm chip has to do way more than a 28nm chip. But they’re also going to rely on fewer, more integrated, larger suppliers with a quality methodology they understand.

SLD: Isn’t that a way of stifling innovation?
Janac: Only if you assume that the larger companies aren’t innovative. That may be the case, but it also may not be. The standards bodies define USB 3.0 or MIPI LLI (low-latency interface), so everyone uses them. If you look at things like UNC (Uniform Naming Convention) and virtualization and resilience, which are all coming to the SoC, those are very innovative. But they have to be delivered with so much quality that only sizeable organizations can afford to do that. And they have to be characterized.
Iyer: I disagree that the big guys will get all the IP wins in the future. They will be the largest in terms of revenue, but there is lots of room for smaller players. It’s progressively getting harder and harder to get something out of these established companies—particularly for small companies.
Janac: So you’re trading off cost against quality?
Iyer: No. The big guys can afford to spend the money on an IP controller. But there is also a lot of opportunity for new players to come in.
Schirrmeister: The better and longer your track record, the more people will trust you. But what’s interesting about the IP space from the system-level side is that companies are trying to figure out whether you can raise this all a level of abstraction. Can you abstract this up a level and create more adaptable IP? That’s a possibility. I don’t believe innovation will be stifled, though. There will always be ideas out of left field. There is lots of potential to automate certain portions with high-level synthesis or higher-level approaches.
Janac: There will be innovation. But for a company running a $300 million mobility platform, will they take a chance? The answer is, no. They have to go with someone they trust because they can’t do all the verification required with a fixed schedule.

SLD: You certainly have to pick your battles, right?
Janac: Exactly.
Iyer: The question we get is whether we’ve worked with IP before. One of the problems we have with the large IP players is they’re not always willing to customize the IP. They don’t cater to customization of IP, like when you want it to do one little extra thing.

SLD: But doesn’t that vary, depending on what the IP is used for? You don’t necessarily want to change a USB.
Iyer: Yes, for standard IP it’s supply and demand. If it’s standard on every PHY, it’s who are the players and can they be trusted.
Janac: We’re talking about cost against trust tradeoffs. For some people that make a huge investment, jeopardizing that trust isn’t worth it. For others, it’s worth the risk.

SLD: As we start stacking die, things that worked fine in a planar architecture may no longer work. How do we verify this?
Janac: You need to build in self-test. There will be test structures.

SLD: But isn’t that extra margin?
Janac: Of course.
Hamid: This problem is not new with stacked die. Even with silicon today, you can only verify so much in simulation, emulation or an FPGA. Ultimately, the real PLLs and drivers and analog circuits have to be tested after the chip comes back. You still can’t get enough cycles with emulation. And you need good tests to find any issues and really stress chips when they come back. It’s not just trust versus cost. It’s also trust versus testability. Once we have tools where I can test to see if IP is good, rather than just relying on the IP vendor doing the tests, that opens up the market.
Iyer: We’ve been working with variation-aware IP at the system level. We’re translating that not just to IP, but for other mission-critical things. As the environment around the chip changes, the chip can adapt to the environment. The design challenges and the characteristics are different if you stack DRAM on top of an SoC, and the design has to be different to handle different environments.
Schirrmeister: There are three angles. One is to make it right in the first place. That goes back to having the right architecture. At the back end you need the right testing capabilities. And then you can do the middle, which is to make it flexible with variation-aware IP. Whatever people can push into software, in order to keep it flexible, they will.
Janac: But for a 3D-IC, it goes back to trust and verify. You have to believe you have superb quality at the die level, and then you have another level of testing problems. You have to have the test structures and the methodology to test this whole die stack. It adds another layer to the test problem. But if all the underlying pieces aren’t tested and proven to work, then the 3D chip will be dead. It adds another layer of quality problems, which is why the packaging houses will become so important in the future. But it’s a problem that has to get solved for the semiconductor industry to move forward. You can’t put everything on a 14nm die. It won’t happen. You have to go 3D because only certain things make sense to put on the latest process.