How To Improve Debug Productivity

Writing test benches is relatively quick, but up to 90% of total verification time is spent debugging. Here are some recommendations from the front lines to improve debug productivity.


In SoC verification, it often takes very little time to write the testbench and the code; the rest of the time, up to 90%, is spent debugging. After all, verification is essentially finding the bugs in a design.

Debugging essentially has evolved over the years on the same path and complexity curve as design. Now debugging needs to evolve to keep pace, observed Ellie Burns, product marketing manager for design verification technology at Mentor Graphics.

“Not long ago the debugging requirement was for gates and RTL, and a large design was 1 million gates,” Burns said. “Today many designs are 100 million to 200 million-plus. They have multiple processors with hardware-software requirements, cache-coherent networks, dynamic object-oriented testbenches, and more. Layered on top of all of this, there are additional standards such as UPF for low power and complex protocols like AMBA5 and PCIe that add additional requirements of abstraction and metadata. These requirements continue to grow as we look at the system-level debug space. Debugging really needs to keep up not only with the design complexity curve, but to keep up with the verification complexity curve, which is steeper.”

The size of designs today is driving the need for emulation, and the demand for more productivity in verification is driving adoption of methodologies like OVM/UVM, she offered. “When users adopt emulation they need debugging tools that provide an easy and consistent way to explore their design and find problems, whether using simulation or emulation, so they do not waste time and productivity moving between the two. Also, because the emulator generates a huge amount of data in a relatively short time, the debugging environment must be specialized to serve up exactly what the user needs on demand. Second, when using languages and concepts such as UVM, which is object-oriented and dynamic, the requirements for debugging are completely different than the requirements of traditional RTL. Because these concepts are interacting with the RTL, the debugger needs to be able to move smoothly between these worlds of testbench and design, and help the user make sense of and quickly find the information that is needed.”

Debug time can be cut in at least two different areas, explained David Larson, director of verification at Synapse Design Automation. “One is you’re debugging your own code. If you think your code is working, now you’re debugging the RTL. That makes our job more challenging because you’re not exactly sure where the problem is. It could be in two different worlds.”

And because verification engineers may be spending 2 to 10 times as long as they should to debug, Larson recommends the following as ways to improve debug productivity:

  1. Increase the visibility of the testbench. “That obviously means use a debugger, for crying out loud,” he said. “I find that at least 70% or more of engineers rely on print statements. They just print because they have the mentality of, ‘I don’t have the time to learn a new tool.’ And they believe that. That’s the number one response I hear when I tell them to learn something new. They are just shooting themselves in the foot. Every simulator comes with a very nice debugger. Use it.”
  2. Use free utilities. “It’s extremely helpful when you are looking at the source code to be able to move around quickly. You might be inside of a function and you’ll want to see where a certain variable is defined. Or you’ll want to see where a function is called. There’s a very good free utility out there called ctags that has been around forever and most editors plug into that utility seamlessly. It basically creates a little file of tags that take you to the area where a certain variable or function or module or class is defined. You might be in a function. You see that this function is calling another function. You want to know what that is. You can just press a certain key sequence in your editor and it takes you straight there. This is a huge productivity booster for debugging,” Larson continued.
  3. Use UVM Objections. “One of the most important features specific to UVM that I find that is also underused is something called Objections,” he said. “Objections, by design, tell you when to end your simulations, but it doesn’t end there, which is the intriguing part. Objections are raised and lowered by any component in your testbench that isn’t happy ending a test. Let’s say you have a driver. The driver receives something to drive. Then it raises an objection to the test ending until it is done sending it. You have various places in your environment that can do this. The reason this is so extremely useful beyond its original intent is that when there is a deadlock in your system — like the test isn’t ending like you think it should — you can print who is objecting, and that will tell you right away who is waiting for something to happen. That will take you very quickly to at least a general area where the true problem is. Another reason why objections are so useful is that it serves the network of another useful UVM feature called the UVM Heartbeat. The UVM Heartbeat is not mentioned anywhere in the user guide and is very under-documented in the reference manual. The UVM Heartbeat watches these objections going up and down and you can program it so that over a certain period of time, if the Heartbeat monitor doesn’t see anything happen, it will kill your simulation. This becomes more and more critical as designs get larger and larger because the simulations take forever — they can take hours and even days — and it’s an enormous waste of time to go back after the simulation is done, after the global watchdog timer finally kicks in. You look at the simulation and only see dead wires after the beginning. The Heartbeat monitor will save lots and lots of time so you can debug it quickly.”
  4. Limit the number of custom interfaces. “On the RTL side, in the SoC architecture when many blocks are custom designed, it isn’t well architected up front in a way to minimize time. There will be many blocks with custom interfaces on them, when they could have defined one, two or three interfaces that they re-use over and over again and still meet their needs. This cuts down both on RTL development time as well as verification time a great deal. Otherwise the verification team has to develop a driver for each custom interface and that takes a long time to debug and work through,” Larson added.
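
The objection and heartbeat ideas in tip 3 can be illustrated with a small model. The sketch below is plain Python, not the UVM API: `ObjectionTracker`, `heartbeat_expired`, the component names and the time values are all hypothetical, chosen only to show why "who is still objecting" and "has anything changed lately" are useful questions to ask of a hung test.

```python
class ObjectionTracker:
    """Toy stand-in for UVM-style objections (illustrative, not the UVM API)."""

    def __init__(self):
        self.counts = {}        # component name -> outstanding objections
        self.last_activity = 0  # simulation time of the last raise or drop

    def raise_objection(self, component, now):
        self.counts[component] = self.counts.get(component, 0) + 1
        self.last_activity = now

    def drop_objection(self, component, now):
        if self.counts.get(component, 0) == 0:
            raise ValueError(f"{component} dropped an objection it never raised")
        self.counts[component] -= 1
        self.last_activity = now

    def objectors(self):
        """Components that are still blocking the end of the test."""
        return sorted(name for name, n in self.counts.items() if n > 0)


def heartbeat_expired(tracker, now, window):
    """True if objections are outstanding but none changed within `window`."""
    return bool(tracker.objectors()) and (now - tracker.last_activity) > window


tracker = ObjectionTracker()
tracker.raise_objection("driver", now=10)
tracker.raise_objection("scoreboard", now=12)
tracker.drop_objection("driver", now=40)  # the scoreboard never drops its objection

stalled = heartbeat_expired(tracker, now=1000, window=500)
if stalled:
    # A real testbench would end the simulation here instead of waiting
    # for a global watchdog timer.
    print("no objection activity; still objecting:", tracker.objectors())
```

In this model the stalled run is flagged once the quiet period exceeds the window, and the list of outstanding objectors points at the scoreboard as the place to start looking.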

When it comes to the FPGA-based prototyping perspective on debug and validation, which is a growing part of the verification effort today, Angela Sutton, Synopsys staff product marketing manager for FPGA implementation, said they very conservatively estimate that over 70% of engineering teams doing ASIC designs are using prototyping as one of the vehicles to do validation.

“That really tells us that they are making a significant investment as well as time commitment to debug,” she said. “Another interesting observation on the FPGA side is when I look at the ratio of implementers, the people doing the synthesis and place & route of FPGAs, versus people spending time in the lab with the system that’s been created on FPGAs to essentially act as a vehicle for validation of the ASIC, I often will see ratios of 4:1 to 10:1 people in the lab using software to do validation. That really tells me the time investment is significant on validation of ASICs.”

It also means that the cost of bugs is so high that engineers will go to great lengths to avoid them. “I look at the flow of what they’re doing — the people that are doing prototyping are simulating first, then they are emulating to gain early functional debug — they want to get the function of the design working at a relatively low clock speed relative to their final clock speed of their ASIC. Then they are going to their prototype, which lets them do functional debug but with the interfaces running real time at speed. So they are going through all of these different phases of incrementally flushing out functional bugs,” Sutton added.

A multi-player game
In addition, Kishore Karnane, product manager for mixed signal verification and Specman at Cadence, noted that with growing design complexity, users are spending much more time in debug and drawing upon more resources to do it. “Today, it’s not like one person does all of these tasks. Because of expertise, you have an RTL designer, you have a testbench designer and then you have somebody who has expertise in all of these different levels, like you have one emulation guy, you have another guy who is doing VIP debug, another guy who is doing mixed-signal debug. It’s becoming more of a multi-player game nowadays. You don’t have one person who understands everything because it is so complex that you really need to get some of these guys sitting together and looking at an environment.”

This is where Cadence is trying to come up with an environment where all of those levels can be easily met with one integrated environment, he noted.

More tips for improving debug productivity
Verification engineers really should adopt a methodology for all of this, because debug is not just running simulations. There has to be a level of planning, whether it is a block or a full SoC being debugged.

Ganesh Venkatakrishnan, front end team lead at Open-Silicon, said the problem most engineers have is the simulation times when it comes to debugging huge SoCs. As a design and verification services provider, Open-Silicon follows a step-by-step approach, starting with debugging simulations at the block level, moving onto the subsystem level, and finally the full-chip level simulations. “This approach gives us an advantage—whatever we develop at the block level in terms of the assertions or the testbench components, we plan it in such a way that they are reusable all the way to the full chip level, as well as the subsystem level. That gives us a lot of reusability.”

“For example,” noted Dhananjay Wagh, engineering IP manager at Open-Silicon, “let’s say in an SoC there are various blocks which are talking to each other. The verification engineer really starts stimulus from one pin and he has to get something out at the other chip, but there are various blocks in between that could go wrong. Those blocks are verified individually at the block level, but when they come together there could be misunderstanding or misconnection between those blocks. When you run a test case and it just says ‘fail’ to you, it doesn’t make sense and you have to debug all the way through all those waveforms and simulation hierarchy, which is a pain for a verification engineer.”

Added Wagh: “When we do a block-level verification, we bring some assertions at the block level boundary and we use them at the subsystem level, where two or three blocks are talking to each other, or a full chip where an entire chip is working as a chip. If the test case fails, and if the interface is wrong, that particular assertion or checker will prompt a message saying that it didn’t get the data properly. We can directly pinpoint there and go there and debug the issue instead of going through the hierarchy to find the problem.”
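
The benefit Wagh describes, a checker at each block boundary that names the failing interface, can be shown with a toy model. This is a minimal Python sketch under stated assumptions, not Open-Silicon's flow: the block names (`dma`, `bridge`, `uart`), the `first_failing_boundary` helper and the injected bug are all hypothetical, and each block is reduced to a function with a reference model beside it.

```python
def first_failing_boundary(stimulus, blocks):
    """Walk data through a chain of blocks, checking every boundary.

    `blocks` is a list of (name, actual_fn, expected_fn) tuples, where
    expected_fn plays the role of the block-level checker's reference model.
    Returns a message naming the first bad boundary, or None if all pass.
    """
    data = stimulus
    for name, actual, expected in blocks:
        want = expected(data)
        got = actual(data)
        if got != want:
            return f"{name} output: expected {want:#x}, got {got:#x}"
        data = got
    return None


def passthrough(x):
    return x


def broken_bridge(x):
    # Injected bug for the example: the top data bit is not connected.
    return x & 0x7F


chip = [
    ("dma", passthrough, passthrough),
    ("bridge", broken_bridge, passthrough),
    ("uart", passthrough, passthrough),
]

message = first_failing_boundary(0xA5, chip)
print(message or "all boundaries passed")
```

Without the per-boundary checks, the only symptom would be a wrong value at the far end of the chain; with them, the report names the bridge directly. The same `(name, actual, expected)` entries could be reused unchanged whether two blocks or the whole chain are under test.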

Another issue to plan early, Venkatakrishnan pointed out, is how the testbench is modeled. As part of that, it should have meaningful debug messages to say what state of the simulation the DUT is in, or the testbench is in. The messages should help to point out if there is any information or a warning or a fatal error to exactly pinpoint where the issue is.
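
As a rough illustration of such messages, the sketch below uses Python's standard `logging` module to stamp every line with a severity, the simulation time, the reporting component and the current testbench state. The `sim` dictionary, the `tb.axi_driver` name, the state names and the message texts are all assumptions made for the example; a UVM testbench would use its own reporting macros, with Python's `CRITICAL` level standing in here for a fatal error.

```python
import io
import logging

# Hypothetical simulation context shared by the testbench components.
sim = {"now": 0, "state": "RESET"}


class SimContext(logging.Filter):
    """Attach simulation time and testbench state to every log record."""

    def filter(self, record):
        record.sim_time = sim["now"]
        record.state = sim["state"]
        return True


log_output = io.StringIO()  # captured in memory so the example is self-contained
handler = logging.StreamHandler(log_output)
handler.setFormatter(
    logging.Formatter("%(levelname)s @%(sim_time)d [%(name)s/%(state)s] %(message)s")
)
handler.addFilter(SimContext())

log = logging.getLogger("tb.axi_driver")
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(handler)

sim.update(now=120, state="CONFIG")
log.info("wrote %d registers", 8)

sim.update(now=4500, state="TRAFFIC")
log.warning("response took %d cycles (limit %d)", 64, 32)
log.critical("no response to read at address %#x", 0x4000)

print(log_output.getvalue(), end="")
```

Each line of the resulting log answers three questions at once: how serious the event is, when it happened, and what the testbench was doing at the time.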

“If you plan well and if you’ve done the right things in your testbench, if you really plot a graph between the timelines versus the debug time spent, the initial phase is where the maximum debug goes in,” he said. “That could be close to 60% to 70% of the time. This is where initial bring up is done. Then, when the testbench matures and you have more meaningful messages, you really don’t rely on the EDA tool as such to tell you where the error is. Over a period of time, it actually should drop down to about 20%. You just have to look at the log, and see what could be the issue and then go fix it.”

As with all real productivity gains in verification, the key to true productivity in debugging is up-front planning, Burns agreed.

“Good verification planning is the key. For debugging, users need to plan for two things,” she said. “First, what tools and technologies will be used so you will know exactly what data will be used and how they will interact and how the results will be debugged. And second, what methodologies and techniques will be deployed. This should be planned and thought out carefully from architecture through emulation/prototype and silicon.”


