Knowledge Center

Trace is a hardware debug feature that allows the run-time behavior of IP to be monitored. More specifically, processor trace is a real-time hardware monitor that non-intrusively captures events in the CPU and sends them to an external device, where they are saved and ultimately reconstructed into human-readable form.

Trace technology isn’t new. It has been in use since the late 1980s, when the microprocessor industry really began ramping up. Trace came about because engineers wanted to see in real time what was happening in the processor.

In trace, data comes off the chip while the processor is running, and chip engineers use tools to analyze that data in real time to understand what is happening in the system. In particular, they want to see the instructions that ran in the processor, the data transactions, the program counter, and the ID of the task that was running at that time. Processor trace provides a detailed history of program execution, which is useful for debugging and performance optimization, and it often has no performance impact on the program being traced. It lets users determine the exact sequence of instructions executed by the processor, which can then be analyzed to get to the root cause of a failure.
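To make the reconstruction step concrete, here is a minimal sketch of how a tool might rebuild the executed instruction sequence. It assumes the trace stream records only the outcome of each conditional branch (taken or not taken) and that the tool has the static program image, which is a common way to keep trace bandwidth low. The `Instr` type, the `reconstruct` function, the fixed 4-byte instruction size, and the toy program are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Instr:
    addr: int                            # instruction address
    text: str                            # disassembly shown to the user
    branch_target: Optional[int] = None  # set only for conditional branches


def reconstruct(program: Dict[int, Instr], entry: int,
                outcomes: Iterable[bool], step: int = 4) -> List[int]:
    """Replay the static program image against recorded branch outcomes.

    Returns the addresses of the executed instructions, in order.
    Replay stops when the PC leaves the known image or the trace runs out.
    """
    bits = iter(outcomes)
    executed: List[int] = []
    pc = entry
    while pc in program:
        instr = program[pc]
        executed.append(pc)
        if instr.branch_target is None:
            pc += step                   # straight-line code needs no trace data
            continue
        taken = next(bits, None)
        if taken is None:
            break                        # trace ended; this branch's outcome is unknown
        pc = instr.branch_target if taken else pc + step
    return executed


# Toy program image (assumed): a countdown loop that runs twice.
PROGRAM = {
    0x100: Instr(0x100, "mov  r0, #2"),
    0x104: Instr(0x104, "subs r0, r0, #1"),
    0x108: Instr(0x108, "bne  0x104", branch_target=0x104),
    0x10C: Instr(0x10C, "nop"),
}

# Hypothetical trace payload: loop branch taken once, then not taken.
path = reconstruct(PROGRAM, entry=0x100, outcomes=[True, False])
for addr in path:
    print(f"{addr:#06x}  {PROGRAM[addr].text}")
```

Two recorded bits are enough to recover six executed instructions here, which illustrates why a compact trace stream can still yield a full execution history. Real trace formats also carry information such as indirect branch targets, exceptions, and timestamps, which this sketch leaves out.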

The problem is that trace has not kept pace with the ongoing explosion of customized processors being developed for cloud, edge, and AI-related designs. In fact, when it was first introduced, processor debug was known as static debug. Today, developers need to be able to trace interactions across multiple and often heterogeneous processing elements that may function independently of each other.

The Nexus 5001 real-time trace standard (IEEE-ISTO 5001) was introduced to add some consistency to this process. It gives tool vendors a standard interface to talk to, so they are not trying to build custom tools for every single processor implementation.

Arm’s CoreSight has become a de facto standard given the proliferation of Arm cores. CoreSight has a Trace Port Interface Unit (TPIU), which acts as a bridge between the on-chip trace sources, each with its own ID, and a single output data stream. The TPIU encapsulates the IDs in that stream where required, and the stream is then captured by a Trace Port Analyzer (TPA). [1]
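The idea of merging several trace sources into one stream while preserving their IDs can be sketched as follows. The framing used here (one ID byte, one length byte, then the payload) is an assumption made purely for illustration. It is not the actual CoreSight formatter protocol, and the `mux` and `demux` names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def mux(chunks: Iterable[Tuple[int, bytes]]) -> bytes:
    """Merge (source_id, payload) chunks into a single byte stream.

    Hypothetical framing: 1 byte source ID, 1 byte length, then the payload.
    """
    out = bytearray()
    for source_id, payload in chunks:
        if not 0 <= source_id <= 0xFF or len(payload) > 0xFF:
            raise ValueError("ID and payload length must each fit in one byte")
        out += bytes([source_id, len(payload)]) + payload
    return bytes(out)


def demux(stream: bytes) -> Dict[int, bytes]:
    """Split the merged stream back into one byte string per source ID."""
    per_source: Dict[int, bytearray] = defaultdict(bytearray)
    i = 0
    while i < len(stream):
        if i + 2 > len(stream):
            raise ValueError("truncated frame header")
        source_id, length = stream[i], stream[i + 1]
        payload = stream[i + 2:i + 2 + length]
        if len(payload) != length:
            raise ValueError("truncated frame payload")
        per_source[source_id] += payload   # keep arrival order within a source
        i += 2 + length
    return {sid: bytes(data) for sid, data in per_source.items()}


# Two assumed trace sources (IDs 1 and 2) interleaved on one port.
merged = mux([(1, b"ab"), (2, b"xy"), (1, b"cd")])
streams = demux(merged)
```

On the capture side, the analysis tool performs the `demux` step so that each source's data can be decoded separately by the decoder that matches that source.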

Trace can also be used by performance profiling tools such as AutoFDO. In that case, processor trace is captured into main memory so that it can be analyzed on the device without interrupting program execution too frequently. In small microcontrollers, where main memory is limited, exporting the trace off-chip is often preferable.
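As a rough illustration of how captured trace can feed a profiler, the sketch below counts how often each branch edge appears in a batch of decoded branch records and reports the hottest ones. The `(source, target)` record layout and the `hottest_edges` helper are assumptions for this example. They do not reflect AutoFDO's actual input format or API.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed record layout: (branch source address, branch target address).
BranchRecord = Tuple[int, int]


def hottest_edges(records: Iterable[BranchRecord],
                  top: int = 3) -> List[Tuple[BranchRecord, int]]:
    """Return the `top` most frequently executed branch edges with their counts."""
    return Counter(records).most_common(top)


# Hypothetical batch drained from an in-memory trace buffer.
batch = [(0x108, 0x104)] * 3 + [(0x120, 0x100)]
profile = hottest_edges(batch, top=2)
for (src, dst), count in profile:
    print(f"{src:#06x} -> {dst:#06x}: {count}")
```

Because the aggregation runs over a whole buffer at once, the traced program only needs to be disturbed when the buffer is drained, not on every event.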