Debug Becomes A Bigger Problem

EDA companies have been developing more integrated debug flows that bring execution engines and hardware and software closer together, but is that enough?


The EDA industry has invested enormous amounts of time and energy in the verification process, including new languages, new tools, new class libraries and new methodologies. But the one part of the cycle that defies that type of automation is debug. Development teams are spending half of their time in the debug process, and the problem is growing.

Part of the reason is that design and debug are expanding beyond being a purely functional problem into other areas such as performance, power and security. In addition, it is no longer just a hardware problem. Debug today has several stakeholders, all of which may want different data, different forms of visualization and different analysis tools.

Expanding execution engines
It used to be that logic simulators were the only tools that had debuggers attached to them, and results could either be viewed interactively or saved into a trace file for off-line visualization and analysis. We have come a long way since then, but each new execution platform has to emulate the logic debuggers that came before it.

“We have added Formal Verification, Emulation, FPGA prototypes and others to simulation,” says Harry Foster, chief verification scientist for Mentor Graphics. “FPGA prototyping is an example. Most reasonable-sized SoCs now use prototyping. Ten years ago the resources available for FPGA debug were really crude such as the ones provided by the FPGA vendors. These were basically small logic analyzer cores. But the problem with those solutions is that the user would have to iterate four to six times just to re-instrument. Each time they run, they would discover additional signals that they had to monitor. Today, we can instrument an order of magnitude more signals. We can handle 64,000 signals and the user selects the subset dynamically without having to recompile the design, which was often an overnight process.”

Designers want consistency and are not prepared to have different debug tools for the same task at different points in the development cycle. “Traditional hardware verification carried out via the venerable HDL simulator will continue to be the choice in the early stages of the design cycle,” says Lauro Rizzatti, a verification expert. “At the IP, block and subsystem levels, system validation, including the interaction of the embedded software with the underlying hardware, is getting too big of a challenge for the HDL simulator. For this task, a designer needs a hardware emulator. It can process billions of cycles in a short period of time. It can boot an operating system and bring the design to a ‘state of interest’ for testing. And it must continue to have simulator-like debugging capabilities that can trace hardware and software malfunctions crossing the two domains.”

And consistency extends beyond just look and feel. “When tools mangle signal names it becomes a problem to trace them through and debug,” says Anupam Bakshi, CEO of Agnisys. “A unified environment must understand successive refinements of the specification in all phases of development: virtual prototypes, simulation, emulation, FPGA prototypes and real chips (validation). This helps reduce the load of debug. When the tools are interconnected it should also be possible to do cross-probing.”

Simon Davidmann, chief executive officer for Imperas Inc., is in complete agreement about the need for more consistency as additional domains are added to the debug problem and many execution engines are required. “There are some companies that have switched to using simulation for software validation. A lot of companies are tackling this at the RT-level, using emulation or FPGA prototypes, but this requires that the hardware design is almost finished. Earlier in the development flow, virtual prototypes provide a better, faster platform.”

Expanding abstractions
Bringing in the software domain adds a whole bunch of challenges. “When you link hardware and software, it is adding a whole new dimension,” says Michael Sanie, senior director of marketing in the Verification Group at Synopsys. “For one line of software that you are trying to debug, it is probably associated with hundreds of transactions. Consider the line of software to open a file. That requires a number of read commands at the transaction level, and then each of those is associated with thousands of lines of code and signals of RTL. Immediately you go hardware/software and go across levels of abstraction you are dealing with data that is 10 to the power 5 different.”

The only way to deal with this is by the use of abstraction, and abstraction can be helpful in several ways. “The abstraction of results data can help with understanding and, hence, debugging of errant behavior by helping expose what is wrong at a more conceptual level,” explains Dave Parry, chief operating officer for Oski Technology. “In effect, this allows an engineer to see the forest and not just the trees. Data and result abstraction can also make it easier to compare and contrast behaviors of passing versus failing tests.”

To get all of the benefits of system-level debug, hardware and software engineers have to be able to work together on problems, and this has traditionally been difficult. It was a long-time industry joke that in many systems companies, the hardware and software teams had to be introduced to each other by an EDA vendor attempting to sell them mixed-domain tools. Those days are behind us, but there are still a lot of differences in the ways that the two teams work, and that can make integration more difficult.

Consider a typical software methodology. “If you have a gdb task talking to one processor and another gdb task talking to another processor, it becomes very hard to control the system,” points out Davidmann. “Multiple threads and multiple processors means that concurrency is happening, and that makes it complex. When you have things interacting in parallel you need to have a complete view of what is happening. Consider taking a Verilog description with many always blocks and only being able to view one at a time. That makes it very hard to do anything. This is true of complex embedded systems today. Not only do you want to see what is happening in one processor, you want to be able to see what is happening in other processors and in the hardware.”

But the problems do not stop there. “If the real hardware is involved, then you also have to deal with non-determinism and races,” continues Davidmann. “Every time you run something in software, you don’t know what is going to happen and this makes debug a lot more complex, especially if the problem is caused by sequencing. We still see people trying to use gdb to debug these types of complex things and this is like trying to debug a complex chip using an oscilloscope. First, you can’t see everything that is happening. And second, every time you put a probe on something you change the system, perhaps change the power, and everything starts to go wrong. Software is becoming more complex than hardware, but the debugging of it can be improved by having the kinds of capabilities available to the hardware team.”

Toward an integrated debug platform
The natural solution would be to migrate the software community to a virtual platform where many of these problems disappear and full control, determinism and visibility can be provided. “In the software solutions there is a level of intrusiveness, especially if using JTAG or Vstream to connect the debugger to the core,” points out Frank Schirrmeister, senior director of product marketing for the System and Software Realization Group at Cadence. “When you connect those to virtualized offerings, such as simulation, emulation, FPGA prototypes, all of those problems go away. In the virtualized solution we can query things without changing the state.”

While some may still be unsatisfied with the operating speeds of the virtual engines, there are other ways in which debug could be performed. “Which users are you looking at?” asks Schirrmeister. “Is it the software guy trying to look into hardware, or the hardware guy trying to deal with software because it executes on his hardware? For the bottom-up guy, you can select the core whose image you want to look at while you are poking in the hardware, and a lot of it can be done in a post-processing form. It is possible to generate a trace from an emulation run that provides all the time-stamped traces. We have all of the hardware traces. Now you can look into each of the cores and understand where they are in the pipeline.”

Post-processing also can be useful for performance analysis, especially for problems that cannot be found from a single run. “Our design environment has a performance analysis database where we obtain transactional data from RTL or SystemC simulations and load that into a SQL database,” explains Drew Wingard, chief technology officer at Sonics. “From that we can extract the underlying transactions. The system can link the original transaction from the initiator, tie that to the views of the transaction as it flows across the network and again for the transactions on the way back for the response. We can show this in graphical ways so they can visualize the causal relationship between what happened at various places. If something is not getting the performance expected, you can investigate where the choke point might be.”
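The kind of post-processing Wingard describes can be illustrated with a minimal sketch. It assumes a made-up record format (transaction id, hop name, entry and exit time in cycles) and uses Python's built-in sqlite3 module. The table layout, hop names and sample numbers are all hypothetical and do not represent the Sonics environment.

```python
import sqlite3

def build_db(rows):
    """Load hypothetical per-hop transaction records into an in-memory SQL database."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE hops (txn_id INTEGER, hop TEXT, t_in INTEGER, t_out INTEGER)"
    )
    con.executemany("INSERT INTO hops VALUES (?, ?, ?, ?)", rows)
    return con

def slowest_hops(con):
    """Return (hop, average latency) pairs, slowest first, to expose a choke point."""
    cur = con.execute(
        "SELECT hop, AVG(t_out - t_in) AS avg_latency "
        "FROM hops GROUP BY hop ORDER BY avg_latency DESC"
    )
    return cur.fetchall()

def end_to_end(con, txn_id):
    """Latency of one transaction from its first hop entry to its last hop exit."""
    cur = con.execute(
        "SELECT MAX(t_out) - MIN(t_in) FROM hops WHERE txn_id = ?", (txn_id,)
    )
    return cur.fetchone()[0]

# Made-up records: (transaction id, hop name, entry time, exit time) in cycles.
SAMPLE = [
    (1, "initiator", 0, 2),
    (1, "network", 2, 5),
    (1, "memory", 5, 25),
    (1, "response", 25, 28),
    (2, "initiator", 10, 12),
    (2, "network", 12, 16),
    (2, "memory", 16, 46),
    (2, "response", 46, 48),
]

if __name__ == "__main__":
    con = build_db(SAMPLE)
    for hop, latency in slowest_hops(con):
        print(f"{hop:10s} {latency:6.1f} cycles on average")
```

Because the data sits in a database rather than a waveform, the same query can be run across many simulation runs, which is what makes this approach useful for problems that a single run does not reveal.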

A lot of effort has been made to bring these two domains closer together, such as integrating a hardware debugger with the popular software environment, Eclipse. “When the simulation time cursor moves in the waveform, the corresponding software statement can be highlighted in the software view,” says Sanie. “Conversely, if a user is stepping in the software view, the simulation time in the hardware views will be automatically synchronized. Users can freely move forward or backward in the whole simulation time range. Users can also set up the breakpoints to quickly jump to interested points. Verdi provides a mechanism that automatically establishes the correlation between waveform and executed software instructions, and allows programmers to simultaneously view multiple cores.”
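The time-to-statement correlation described above can be sketched as a lookup over a time-stamped software trace. This is an illustration only: the trace format, the class and its method names are assumptions and say nothing about how Verdi implements the feature.

```python
from bisect import bisect_right

class SoftwareTrace:
    """Maps simulation time to the source line executing at that time, per core."""

    def __init__(self, events):
        # events: (start_time, core, source_location) tuples, a hypothetical format
        self._by_core = {}
        for time, core, loc in sorted(events):
            times, locs = self._by_core.setdefault(core, ([], []))
            times.append(time)
            locs.append(loc)

    def location_at(self, core, time):
        """Source location active on `core` at `time`, or None before its first event."""
        times, locs = self._by_core.get(core, ([], []))
        i = bisect_right(times, time)
        return locs[i - 1] if i else None

    def time_of(self, core, loc):
        """First simulation time at which `core` reached `loc`, or None if it never did."""
        times, locs = self._by_core.get(core, ([], []))
        for t, l in zip(times, locs):
            if l == loc:
                return t
        return None

    def view_at(self, time):
        """Snapshot of every core at one time, as a multi-core view would show it."""
        return {core: self.location_at(core, time) for core in self._by_core}

# Made-up trace for two cores.
EVENTS = [
    (100, "cpu0", "main.c:10"),
    (250, "cpu0", "uart.c:42"),
    (400, "cpu0", "main.c:11"),
    (120, "cpu1", "dma.c:7"),
    (300, "cpu1", "dma.c:8"),
]

if __name__ == "__main__":
    trace = SoftwareTrace(EVENTS)
    print(trace.view_at(300))              # waveform cursor moved to t=300
    print(trace.time_of("cpu1", "dma.c:8"))  # software view stepped to dma.c:8
```

Moving the waveform cursor corresponds to `location_at` or `view_at`, and stepping in the software view corresponds to `time_of`. Since the trace is recorded rather than live, the lookup works equally well forward or backward in time.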

The hardware world increasingly is using transaction-level execution and debug for performance, power and other architectural considerations, as well as for performing functional verification after multiple IP cores have been integrated into a system. “Migration of the verification world to the transaction-level helps a lot with that task,” says Zibi Zalewski, general manager of the hardware division of Aldec. “Working at the message-level is also more natural for software developers and simplifies the cooperation with the hardware teams. It is a big thing to have common data formats understandable by both teams. It also allows you to insert transaction-based protocol monitors and checkers into the verification platform. This not only helps with debugging, but also automates the process of detecting and narrowing down the problems. Message-level debugging is definitely one of the most common debugging tasks for modern SoC projects.”
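A transaction-based protocol monitor of the sort Zalewski mentions might look like the following minimal sketch. The request/response protocol, the message fields and the latency limit are invented for illustration and do not correspond to any Aldec product or real bus standard.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Message:
    time: int   # simulation time of the message, in cycles
    kind: str   # "REQ" or "RSP" in this made-up protocol
    tag: int    # matches a response to its request

class ProtocolMonitor:
    """Checks a hypothetical request/response protocol at the message level."""

    def __init__(self, max_latency):
        self.max_latency = max_latency
        self.pending = {}   # tag -> time the request was issued
        self.errors = []

    def observe(self, msg):
        if msg.kind == "REQ":
            if msg.tag in self.pending:
                self.errors.append(f"t={msg.time}: tag {msg.tag} reused while outstanding")
            self.pending[msg.tag] = msg.time
        elif msg.kind == "RSP":
            start = self.pending.pop(msg.tag, None)
            if start is None:
                self.errors.append(f"t={msg.time}: response for unknown tag {msg.tag}")
            elif msg.time - start > self.max_latency:
                self.errors.append(f"t={msg.time}: tag {msg.tag} took {msg.time - start} cycles")
        else:
            self.errors.append(f"t={msg.time}: unknown message kind {msg.kind!r}")

    def finish(self):
        """Flag requests that never received a response, then return all errors."""
        for tag, start in sorted(self.pending.items()):
            self.errors.append(f"end: tag {tag} issued at t={start} never completed")
        return self.errors

if __name__ == "__main__":
    monitor = ProtocolMonitor(max_latency=10)
    for m in [Message(0, "REQ", 1), Message(5, "RSP", 1),
              Message(6, "REQ", 2), Message(30, "RSP", 2)]:
        monitor.observe(m)
    print(monitor.finish())
```

Each error carries the time and tag of the offending message, so the checker reports where to start looking instead of leaving the engineer to find the violation in signal-level waveforms.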

Agnisys’ Bakshi reminds us that the development process itself can bring surprises that need to be understood by the users. “One need for debug is due to inaccurate successive refinement steps in the design process. Typically, a product design team starts off with a design specification, and that specification is transformed into an IP or a chip. When the transformation breaks down, engineers need to track the points at which there was an incorrect transformation.”

Context sensitivity
Verification IP has an important role to play in debug. “If you are looking at a protocol, it has to be context sensitive,” says Sanie. “USB has a whole different set of packets and interfaces. Having that information embedded into the debug environment so that your users do not have to become interface experts is very valuable. Context sensitivity is very important.”

Assertions also can be a valuable part of IP integration. “You can have assertions that get triggered if you are not using the IP in the right way,” continues Sanie. “This enables you to flag things up front. If you go back six months later and try and look at waveforms, it is a lot more difficult. You want to find stuff as quickly as possible.”

Having context sensitivity for protocols is not enough for all tasks. “When you are trying to debug things happening at the system level you don’t want to be looking down at the details—even though it might help you some of the time, such as when you are having certain issues within a driver,” says Davidmann. “But to debug software you want to be at the highest level of abstraction possible. You want to be working at the OS level and the application level so that you can see how they are all interacting. You need the relevant technologies to be built into the debugger, which knows that there is an OS, and to be able to present that level of abstraction to the user. Then they can ask questions such as, ‘Which functions are using the most power?’”

  • Gajinder Panesar

    I understand here you are primarily focussing on tools, but people should recognise that the whole system (SoC and software) is a very complex beast, and as such it is almost impossible to truly understand how it will operate in deployment. As such they should start designing SoCs that can be debugged more efficiently. This can then supply better information to the tools you describe: today, it doesn’t matter how good your tools are if they have no visibility of the system you want to debug.

    That will mean adding on-chip hardware: but that will not only give you benefits during debug, but also optimisation in the field too.

    • Brian Bailey

      I agree with you, but several companies have attempted to create IP to do exactly that and have not been successful thus far. The question is: how much insurance are you willing to pay to be able to find out what goes wrong in the field?

      • Gajinder Panesar

        There is a company in the UK called UltraSoC Technologies Ltd that provides IP to do just the sort of things I highlighted. I just happen to be the CTO.
        I don’t think there is one number that will answer your question. But if the IP is non-intrusive (as ours is) and can be effectively switched off when not needed, it comes down to how much silicon area it takes. For modern SoCs the silicon cost associated with this IP is (almost) insignificant compared to the overall SoC.