Big Shift In SoC Verification

Experts at the Table, part 3: Dawn of the glue engineer; the search for a common language between hardware and software; system-level coverage; finding more bugs faster and more inexpensively.


Semiconductor Engineering sat down to discuss software-driven verification with Ken Knowlson, principal engineer at Intel; Mark Olen, product manager for the Design Verification Technology Division of Mentor Graphics; Steve Chappell, senior manager for CAE technology and verification at Synopsys; Frank Schirrmeister, group director for product marketing of the System Development Suite at Cadence; Tom Anderson, vice president of marketing at Breker Verification Systems; and Sandeep Pendharkar, vice president and head of product engineering at Vayavya Labs. What follows are excerpts of this discussion, which was held in front of a live audience yesterday at DVCon.

SE: Do we need a new kind of engineer to bridge the hardware and software worlds?

Pendharkar: Some customers are taking up that approach. One manager has hired three software engineers to be part of the design verification team. These software engineers write device driver code in C. The verification engineers then sit with them and try to understand it and apply it to the block-level side of things. It’s difficult to get people to understand this, but it might be like having software guys be part of the verification team.

Schirrmeister: A lot of problems we have today are hard to define as either hardware or software. If you try to find leaks in your embedded software, the problem becomes whether it’s a hardware or software problem. The only way to find it is to look into the virtual platform and check the memory that has not been released, so it has to be tested from the hardware perspective. That’s why I’m wondering if we need a new species. They need to understand both sides, and the problem has to be defined in a way so it’s a true hardware-software issue.

Knowlson: We definitely see ourselves developing systems engineers—they’re really more like glue engineers. The glue is to know enough about the specific firmware that you can take it and modify it, and then work with the design guys. Getting to the software guys early enough is really tough. You can get the architects. But getting people who know the code early enough is always a challenge.

Schirrmeister: What we EDA vendors are trying to do is to create an environment where a moderator—let’s call him Dr. Phil—brings in a hardware and a software guy, they both look through their different perspectives at the same problem, and that glue person is becoming the moderator. He knows enough about both sides to be dangerous and to challenge a bluff from the software or hardware guys that it’s not their problem.

Anderson: And we need a common language.

Schirrmeister: The common language is an important aspect.

Anderson: Where people are handwriting diagnostic software for the purposes of verification pre-silicon, the customers tend to be embedded programmers on loan to the verification team. The verification team members are experts in SystemVerilog and test benches, but they don’t necessarily know how to write embedded code. The embedded engineers who are on loan have to learn something about RTL and simulation. At the same time, some knowledge is going to rub off on the verification team about embedded software. It may not be a new breed of engineer, but there is a hybrid skill set occurring as a result of this cooperation. If you’re going to more of an automation approach, sometimes it’s the embedded folks and sometimes it’s the verification team.

Knowlson: For the complex IPs with embedded processors we have dedicated firmware validation teams to validate the IP. But they’re just validating the IP. They’re not validating it in an SoC context. And which SoC would they pick?

Schirrmeister: If you look at companies like National Instruments, they do a lot of testing. While a lot of what we do pre-silicon is verification, there is someone on the back end using LabVIEW or other test environments to test the chip post-silicon. Does this profession already exist using a representation like LabVIEW, and can some of this be applied earlier in the flow?

SE: Are we getting to the point where we really can have good coverage of a complex SoC, or do we need to rethink how we apply coverage models?

Olen: You can have good coverage at different levels of your design, from block level to subsystem to full system, but you’re covering different things. Don’t think you can re-use your SystemVerilog models from the block level at the system level. You’re not covering the same thing.

Schirrmeister: That leads back to the flow. One company told me that everything they do higher up at the system level is additional effort. It is only something they can do if it uses the overall verification effort from all levels. If you really can go up and verify something at a higher level, you better make sure you have enough automation in place to auto-generate from there and make sure you don’t have to re-verify that one level down. Otherwise it was wasted effort. The question is whether those other models are used for high-level synthesis—the TLM design and verification flow—and whether they can be synchronized effectively with the models used for simulation, verification and TLM-based virtual platforms. Once you have that level of automation in place, then you can go up and do this additional level of verification.

Pendharkar: The notion of coverage at the system level needs to be enhanced, because when you are talking about the system level, there are a bunch of things you want to know if you are doing software-driven verification. You want to make sure all your IPs are exercised properly and that all the possible interactions between these IPs are verified. There are a lot of shared resources, so you want to make sure those get verified effectively, as well. Also important in the system context are hardware-software interaction scenarios. When we talk about coverage at the system level, all of these need to be addressed.

Knowlson: Do you have to cover all possible interactions?

Pendharkar: That’s a good question. It depends on what you are verifying. If you talk to a consumer electronics company or a set-top box maker, the answer will be different than it is for a car.

Knowlson: If you don’t know what it’s going to go into, you have to. That’s the hard problem. We need to figure out how to do less and get equivalent quality.

Schirrmeister: You do less by only dealing with the relevant pieces.

Knowlson: Exactly.

Pendharkar: The answer is very different depending on who you talk to. Some engineers will ask, ‘Why do I need to run all the test cases?’ If there’s a bug, they will just push out a software patch. But then I talk to other people and they are so paranoid that every test case, starting with simulation, they want to be able to reproduce all the way to post-silicon. That is because they don’t have a way to achieve more with less. The alternative is over-engineering.

Olen: And by doing that manually they can guarantee that they’re covering 100% of the tests that they wrote.

Chappell: We talked about a common language. When we say language we think of syntax. It’s not necessarily a syntax, though. A language is just the way that we communicate. So perhaps coverage is a way that we can start communications between the software people and the hardware people. And if you get that early in the planning process, you can say, ‘This is what we’re trying to achieve from a verification standpoint, this is where you do your functional coverage and code coverage, this is where you do your use-case coverage, and how do I split that up given the different engines that are available to me?’ There is that coverage aspect, and then there is the platform aspect.

Knowlson: And it has to be seamless. I don’t care if it’s one tool or three tools, as long as it’s seamless.

Chappell: Going across all those levels of abstraction, the lines start blurring between hardware and software. They blur between simulation and emulation.

Schirrmeister: This goes back to the dispute Tom and I had at the beginning of this discussion. There has to be seamless integration of what you do in one engine and another.

SE: Will any of this help get the cost down for developing chips? Estimates for SoC verification run as high as 80% to 85% of NRE.

Anderson: If you look back 15 years ago, when people did verification they hand-wrote every test. Then constrained random verification came along and automated, to a large extent, what happened with the test bench. We see, whether it’s a graph or some other technology, the idea of auto-generating test cases you’re using to validate an SoC, running on the processors to simulate and beyond, as a similar sort of automation. You’re taking 10 or 15 embedded engineers on loan to test to run simulation and emulation and replacing them with automation. That has to result in time savings and dollar savings.

Schirrmeister: The reference point is important. I would be very careful in claiming reduction in cost. What we do is keep the cost in check through higher productivity to deal with the complexity. The complexity is still growing faster than the productivity of all our tools and methodologies. If the reference point is not increasing complexity, then it does reduce cost. But the complexity grows so fast that absolute cost is not being reduced. Verification is still an unbounded problem. I don’t know anyone who comes to the point where they say everything is done in verification. They are all sweating when they do the actual tapeout, hoping they haven’t missed the one thing that changes their career.

Olen: Even with the graph-based technique that can get you to your coverage goals a lot faster than constrained random testing can, customers don’t say now they’re done. What it usually does is cause them to peel back the layer of the onion and say, ‘I’m not covering what I thought I was covering and I need to expand my goals and verify further.’ In the end you fill the time with as much verification as you can get done because it’s an unbounded problem.

Schirrmeister: Automation does help to deal with the complexity that a designer can’t handle by hand. You get enough tests through automation, and the tests underneath them.

Olen: But if people want to take advantage of some of the new areas of automation, they also have to be willing to look beyond what’s available today. The standards committees and communities are great for helping us with interoperability, but maybe some of this automation requires us to look beyond standards. A standard is creating a common interchange for something we already know today. We’re not looking for something incrementally better. We’re looking for something that is 10X or 100X more efficient. We should probably have a panel session on whether standards are holding us back.

Chappell: We also need to look at what part of the problem we’re solving here. When you’re looking at the overall cost of verification, whether you’re using graph-based or directed test, you still have to write your test bench. What part of the sequence are you going to run on simulation versus emulation? How do you set up your prototype and emulation platforms? If you’re looking for productivity and performance to reduce the total cost of verification, you have to look at that overall big picture. You need a seamless learning curve.

Pendharkar: Are we looking to reduce cost or improve the quality of verification? I would like to believe it’s all about improving the quality of verification. Hopefully then the cost is taken care of automatically.

Anderson: Yes, what’s the cost of a product recall? Customers want both. They want us to find more bugs, and find them faster.

Schirrmeister: And they would like it to be cheaper, too.

Knowlson: Automation is definitely something we need to do. I don’t see the validation discipline going away with production software. Frankly, you can’t do a whole bunch of things in pre-silicon with production software. A lot of production use cases have to do with user experience, timing and performance, which are very complicated things that, even if the infrastructure could do it, would take weeks to run. So please don’t depend on production software to solve all the problems.

Schirrmeister: But I do see a lot of people using Linux like a test OS. There are companies doing specific, targeted OSes. It’s definitely a growth area for EDA to deal with this.

Chappell: And it gets back to what engine you’re going to use. You’re not going to boot Linux in simulation. Maybe if you have a really fast emulator you can do that.