Experts at the table, part 2: IP vendors are facing new challenges and for some of them, such as security, the tools and methodologies are not in place.
Semiconductor Engineering sat down to discuss the limitations of simulation in more complex designs with Mike McNamara, CEO of AdaptIP; Pete Hardee, product management director at Cadence; David Kelf, vice president of marketing for OneSpin Solutions; Lauro Rizzatti, an emulation expert; and Arturo Salz, scientist in the verification group at Synopsys. In part one the panelists discussed the increasingly important roles that formal and emulation are playing. What follows are excerpts of the next part of that conversation.
SE: A chip is well defined and has a single function, but IP is configurable. It is not just one design, and there are increasing amounts of software that get placed on top of it.
McNamara: You are shipping firmware with that IP and that gets integrated with other firmware and that provides another chance to do it wrong.
Salz: That is where the big gains happen. You have to have a platform that supports power, but the value-add comes from the firmware.
Hardee: The firmware is important, but consider the configurability. First, the fundamental, exhaustive nature of formal makes it possible to check every possibility for every combination of inputs. That becomes even more important for configurable IP. This is why we are seeing even faster growth of formal within IP vendors compared with the general market. Second, we are seeing an interesting application of sequential equivalence checking for configurable IP. When a configuration change is made, they are using it to check that the things that should change do change, and that nothing else changes. Checking that nothing else changes is just as important.
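[Editor's note: the "things that should change do change, and nothing else changes" check can be sketched in miniature. The toy model and names below (`ip_model`, `check_config_change`, the `sat` and `parity` outputs) are illustrative assumptions, not any vendor's tool.]

```python
from itertools import product

# Toy configurable "IP": an adder whose saturation output depends on a
# configuration parameter, while the parity output should not.
def ip_model(width_cfg, a, b):
    total = a + b
    sat = min(total, width_cfg)   # config-dependent output
    parity = total & 1            # config-independent output
    return {"sat": sat, "parity": parity}

def check_config_change(cfg_old, cfg_new, inputs, must_change={"sat"}):
    """A poor man's equivalence check across a configuration change:
    the expected outputs must change, and nothing else may change."""
    changed = set()
    for a, b in inputs:
        old, new = ip_model(cfg_old, a, b), ip_model(cfg_new, a, b)
        for out in old:
            if old[out] != new[out]:
                changed.add(out)
    assert changed <= must_change, f"unexpected change in {changed - must_change}"
    assert changed == must_change, "expected outputs did not change"
    return changed

inputs = list(product(range(16), repeat=2))
print(check_config_change(7, 15, inputs))  # only 'sat' differs
```

Real sequential equivalence checking does this exhaustively over sequences of inputs and internal state, not just over a sampled input list.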
Salz: In present systems there is so much redundancy built in that a functional hardware bug may not manifest itself as a failure in the system. It may result in performance degradation. Typically, within mobile systems, we see things take more power than expected, and it is usually because software forgot to shut something down. In communications you may drop some packets, but there may be a retransmit, so it can be very hard to find these bugs. Formal may help to locate such a bug, but it does not catch it directly because it is only manifested at the system level. Constrained random verification is like pool sweeping. The constraint is the shape of the pool. You don't have to give it much, and given enough time it will do the job.
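[Editor's note: the pool-sweeping analogy can be sketched as a toy. Sample random points, keep only those inside the constraint (the shape of the pool), and coverage fills in over time. Everything here, including `constrained_random_sweep` and the address-window constraint, is an illustrative assumption rather than a real verification tool.]

```python
import random

def constrained_random_sweep(constraint, space, trials, seed=0):
    """Randomly sample `space`, applying only points that satisfy
    `constraint` (the shape of the pool); return the covered points."""
    rng = random.Random(seed)
    covered = set()
    for _ in range(trials):
        point = rng.choice(space)
        if constraint(point):      # only legal stimulus is applied
            covered.add(point)
    return covered

# The "pool": all 8-bit values; the constraint: a legal address window.
space = list(range(256))
legal = lambda a: 0x10 <= a < 0x80
hits = constrained_random_sweep(legal, space, trials=5000)
print(len(hits))  # with enough trials, coverage approaches the whole window
```

The point of the analogy is the last line: given enough time, the sweeper covers the whole pool, but when the pool becomes an ocean (as the next comment notes) no amount of trials is enough.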
Rizzatti: This is a great role for emulation. It is orders of magnitude faster than simulation.
Hardee: There are some bug-hunting techniques in formal that are doing very similar things now. You use the properties to drive bug hunting and also use massive parallelism. It is sometimes called a bee swarm. One technique is elastic bounded model checking (BMC), which allows you to leapfrog the state space. You do not look at every state; you leap to random states to penetrate the IP in very interesting ways. When you do that with a swarm of engines with different seeds, you get a parallelized search that covers far more of the state space and penetrates deeply. This is a formal version of constrained random.
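[Editor's note: a loose sketch of the swarm idea, assuming a toy state graph rather than a real BMC engine. Several searches run with different seeds, and each one occasionally leaps back to a random previously seen state instead of stepping through the state space exhaustively.]

```python
import random

def leaping_search(next_states, start, is_bug, seed, steps=500, leap_prob=0.3):
    """One engine: a seeded walk that sometimes leaps to a random
    previously visited state before taking its next transition."""
    rng = random.Random(seed)
    visited = [start]
    state = start
    for _ in range(steps):
        if is_bug(state):
            return state
        if rng.random() < leap_prob:
            state = rng.choice(visited)          # leap, don't step
        state = rng.choice(next_states(state))   # take one transition
        visited.append(state)
    return None

def swarm(next_states, start, is_bug, seeds):
    # One engine per seed; in practice these would run in parallel.
    for seed in seeds:
        hit = leaping_search(next_states, start, is_bug, seed)
        if hit is not None:
            return seed, hit
    return None

# Toy state space: a value mod 1024 with two transitions; "bug" at 777.
found = swarm(lambda s: [(s + 1) % 1024, (s * 3 + 1) % 1024],
              start=0, is_bug=lambda s: s == 777, seeds=range(32))
print(found)
```

Different seeds make each engine's leaps and steps diverge, which is what gives the swarm its broad, deep coverage.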
Salz: Random and formal are the two workhorses of the verification industry, and we will continue to use them. But when the pool becomes the Pacific Ocean, you will not want to do that. It will never have enough resources. When you throw software into the equation, the state space becomes enormous. The formalisms that have been developed for hardware design do not exist in software. You have a stack and a heap, and you cannot write properties for that. Even with emulation, you need a test that exercises the corner-to-corner case, and that gets very hard. You also get into problems with porting stimulus from the IP level and reusing it at the system level. These problems are not solved yet.
McNamara: Another unsolved problem is caused by us disaggregating how we buy IP. Systems companies buy major blocks from a bunch of people and then integrate them together. It used to be that the tool said: you have a property that says A will be asserted before B, and it is violated on line 107,056 of file XYZ, but I cannot read that file. Simulation may tell you that there is an X here, but I cannot find out why. So how do you give a message to the person who is reading it that provides enough information to understand what happened? There is an analysis problem that used to be solved by having the experts in house.
Salz: The question is how you package your IP so that you give actionable messages to the end user. The next thing, as an IP provider, is that we need a testcase to reproduce it.
McNamara: Yes. All of our tools have assumed that the reader of a message is able to take it and do something with it. Now that person may be two or three levels removed from the person who coded it. So how do you convey enough information? It could be in the firmware, which would add more layers into this. It provides a new challenge. An SoC is an aggregation of a couple of dozen sub-systems, and each of those may have aggregated other IPs, and then you add firmware.
Hardee: It is like the needle in the haystack. You could be finding that anomaly who knows how many cycles deep in an emulation trace. We are looking at the cross-pollination of technologies and how formal debug techniques can start helping with simulation and emulation. If you can capture the anomaly, you can describe it as a property and then run formal to find the shortest path to that problem. That can be very enlightening.
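[Editor's note: the "shortest path to the problem" step is essentially a breadth-first search from reset to the captured property violation. A minimal sketch on an assumed toy state machine, not a real formal tool:]

```python
from collections import deque

def shortest_trace(next_states, reset, violates):
    """Return the shortest sequence of states from reset to a property
    violation, or None if no violation is reachable."""
    frontier = deque([[reset]])
    seen = {reset}
    while frontier:
        path = frontier.popleft()
        if violates(path[-1]):
            return path                   # minimal-depth counterexample
        for nxt in next_states(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None

# Toy: a 4-bit value that can increment or double; anomaly at value 9.
trace = shortest_trace(lambda s: [(s + 1) % 16, (s * 2) % 16],
                       reset=0, violates=lambda s: s == 9)
print(trace)  # [0, 1, 2, 4, 8, 9]
```

This is why a formal counterexample is so much easier to debug than the original emulation trace: it reaches the same anomaly in the minimum number of cycles.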
McNamara: I agree but what if I am in a systems company and I bought IP from various people. What I need is a message that says go talk to Mike McNamara at AdaptIP who is the person responsible for it. If I can get the problem back to the person who owns the IP, then they can do something with it. It could also be that when this IP is used by everyone else it is fine, but when used in combination with this and that, then it may not even be clear which guy is important.
SE: A lot of IP wants to appear black box, but doesn’t that mean that you have to provide more visibility?
Kelf: There are techniques, such as witness generation, that can put up a nice trace with minimal depth. But the question is: how could you provide that in such a way that it does not violate any black-box condition? You want to be able to see the trace going through to the place that provides enough insight for the IP vendor and the folks constructing the SoC. How can that be generated while not providing too much information about the rest of the design? There may be a contract for that.
Salz: It is constantly renegotiated. The tradeoff between visibility, actionability, and security is something you have to shuffle all of the time. People are shipping encrypted IP, and they don't want people to see inside it. There are often two companies that don't completely trust each other, and they are sharing IP that is encrypted, but when there is a bug…
Kelf: You probably have to ship out an engineer, who then sits with the person who found the bug. They want a tool that can help them trace it to that point without revealing the rest of the IP, a tool that can provide the information necessary to find the bug. Formal may be able to provide that narrow focus.
Salz: But you can’t display that information to the end user. A methodology has to be developed around this problem.
McNamara: And then you put security in there. It used to be the case that there was an understanding based on both sides having enough money to hire lawyers, and that was a way to break through the barrier. But this opens you up to security issues. You can't do that anymore.
Salz: There are problems of authentication and security. There is leakage seeping in from the overall domain into IP. We are not used to that.
McNamara: Before security, it was just a financial issue. I want you to buy my IP, and I want you to buy it again for the next chip. If we get to a point of trust and give them unencrypted IP, we can still assume that if they want a 1GHz widget today, they will want a 2GHz widget next time, so they cannot use the old design anyway. But with security, it becomes: I trust you, but I cannot show it to you for security reasons. And there is another side to this. If it is hidden, then back doors may not be observed except by the people who inserted them.
Salz: Within EDA we supply the encryption capability, but the communication channel and the endpoints are owned by other people. When you ship software such as OpenSSL, anyone can bring up a debugger and put a breakpoint in there, and it is quite easy to find the math functions, and there it is. It is easy to break these things, and if you are really devious you can do it. How do you harden a platform when you are shipping software to an end user who may not be the real end user, instead being two or three steps removed?
Emulation is just hardware-assisted simulation, and it has some security issues because it uses the actual logic of the design. Behavioral models of IP are more secure and can go fast if you use GP-GPU platforms for your simulation. That is, I can give you an executable spec for an IP block that the gates match; it will run fast, and nothing needs to be encrypted.
The argument I hear most often against that is that behavioral models do not have timing accuracy, and that this is necessary for getting close enough estimates for power. When you put in the necessary accuracy, you lose a lot of the performance, plus you have the problem of proving that the behavioral model and the RTL are a perfect match.
RTL is a behavioral model. Behavioral models can have the same accuracy as models made up from smaller components (e.g., SPICE level); it just depends on what you want to trade off. For example, there is no point in having a model that works at 5V if you only use 0–2V; just add an assertion for being over 2V. Emulators are pretty useless for anything analog (including power), as is regular Verilog. Try verifying DVFS or FD-SOI back-biasing in either.