Corner-Case Bug Hunting for RISC-V

Nice to have, or an imperative?

popularity

By Ashish Darbari and Ia Tsomaia

RISC-V continues to make headlines worldwide, but verification continues to be challenging. The findings of the Wilson Research Report, 2022 (see figure 1) make the trends in verification clear. We presented these in a keynote talk titled, “Future is Formal,” at the recent DVCon India event.

One thing is quite apparent: whether you are using directed tests or constrained random, the world is still obsessed with test-based verification. While there is definitely a strong need for constrained random as well as directed testing, it is also clear that they are both heavy on stimulus-based verification, where a human or now, generative-AI-assisted tools have to design the test sequences to explore the deep states in the design.


Fig. 1: Wilson Research Report 2022, with findings showing ferification is in a crisis.

Bread-and-Butter Case Verification
In most test-based environments that use dynamic simulation, the aim is to start throwing in some tests to evaluate if the underlying DUT works at all. But the shift to exhaustive bug hunting has been known to be impossible for simulation, at least for anything that is moderately complex.

So how do we guarantee that the simulation has not missed any corner-case bugs? The answer lies in coverage.

Whereas functional coverage along with other structural coverage metrics (code, toggle, expression and branch) are used to decide the final outcome of verification collectively, none of these is scaling up as design complexity grows.

The standard bread-and-butter test cases for simulation-based verification are inadequate to ascertain that the silicon can be shipped free from bugs. In fact, plenty of time is wasted doing bread-and-butter testing that could be better utilised. Corner case bug hunting is not natural for directed testing.

While constrained random continues to be used extensively, it fails to find all the bugs, never mind the corner case bugs. This happens because of the heavy reliance on stimulus sequences, their timing, and randomisation. The result is that even the simplest designs that could be verified exhaustively with formal methods in hours continue to be shipped with bugs, having not been caught by simulation before.

Formal Verification
At the heart of formal methods-based verification is a requirements-driven approach that relies on formulating the requirements as formal properties (assertions/covers/assumptions). Once the requirements have been formulated in a property specification language such as SVA or PSL, the properties are run in a formal tool, which then builds proof that the requirements hold against the design. If the formal tool cannot build a proof, it will show a failure as a counter-example that can be debugged – all without needing any manual stimulus.

Formal verification is verification with unbounded stimulus. Although formal verification does have a state-space explosion problem as it aims to be exhaustive every time, when proper methodology is used, the results are staggering. In our experience of verifying the RISC-V designs both in-order and out-of-order, we are yet to encounter any issues with proof convergence.

Formal Verification for RISC-V: Going beyond Functional Verification
At Axiomise, we built an automated formal verification solution in 2019 and showed how it was used to verify 0riscy, ibex and subsequently other processors such as cv32e40p, CVA6 and the WARP-V family of processors.

Figure 2 shows the results obtained on a range of these cores and all the results of reporting bugs have been filed on Github and are accepted as bugs by the designers.


Fig. 2: Summary of processors verified by the Axiomise formalISA app.

While most of the time is spent on the functional side in most verification tasks, we presented talks at the RISC-V summit in 2019 on showing how lockstep verification could be done quickly through formal methods. We also discussed security verification in DAC 2020, when we showed how the CIA approach to security coupled with vulnerability analysis measurement could identify a metric-oriented view of assessing security.

Over the years, we have been applying our formalISA app for several projects, including ten open-source projects. We have found that corner-cases have been picked up by our formal properties in no time and in many cases within seconds of run time, not even minutes. What we find so exciting about this exercise is that while most, if not all, the processors have been previously verified by simulation, we can always pick bugs with our formal solutions that have been missed before.

Formal Verification for Deadlocks
When we verify a RISC-V core with AXI bus implementations, we often encounter issues with the AXI implementations. Below is an example of a deadlock (or starvation) issue found in the PULP platform implementation of the AXI protocol. What happens is that if the awvalid and awready channels are busy, then the AXI arvalid signal can wait forever for the arready signal. This bug was picked up when exercising our AXI ABVIP on a RISC-V core.


Fig. 3: If the awvalid and awready channels are busy, the AXI AR valid can wait forever for its AR ready signal.

Formal Verification for Security
Talking about security, we got interested in the cheriot-ibex processor. We didn’t know a lot about CHERI initially. We got excited hearing about it at the RISC-V summit Europe, 2024. After spending a week reading up on CHERI, we decided to look at cheriot-ibex. We integrated it into the app within a few days, and we immediately encountered quite a few interesting issues that we reported to the cheriot-ibex team.

One of the first ones we picked up shows the strength of formal and why we need formal as our first point of verification.

CHERI specification advocates for security first. A CHERI implementation of RISC-V ISA provides extra guarantees and protection from memory leaks because the CHERI instructions provide those safeguards. However, the challenges remain in ensuring verification quality because we need to verify that the CHERI instruction implementation is not only not broken but also doesn’t introduce functional, safety or security issues. A lot of the verification work, therefore, must focus on ensuring not just compliance with the RISC-V ISA or CHERI specifications but also ensuring that all implementation artefacts preserve security as well as functionality. At first, we were surprised to have picked this bug in the first place, as we knew that this processor has been previously verified by simulation and formal.

The case we found is fascinating. We see a case that an illegal CHERI LOAD instruction is issued in the pipe, it does not get squashed and is sent for execution to the memory, from where it picks up a data value that ends up overwriting the execution of the RTYPE instruction.

Figure 4 shows the problem. We believe there are multiple issues here. (a) the illegal instruction must raise an exception and must be thrown out of the pipe and not sent to memory (b) Even if we waste pipeline cycles compromising performance and burning power executing an illegal instruction; it must not interfere with the regular functionality of the processor’s other instructions which it does. The problem is not just that it is impacting the RTYPE instructions, but it gets compounded as STORE instructions would also not work correctly, corrupting the memory, picking up values from the register file that they must not.


Figure 4: The bug caught with formalISA in cheriot-ibex

Now, as it happens, the designers picked a separate issue that they suspected may be connected to the issue we spotted. They responded to us, saying that a fix is available and we can test it.

Why simulation will find it difficult to catch such bugs?
The reason this kind of bug is difficult to catch with simulation is because the number of cases where an illegal instruction is meant to raise an exception and needs to be trapped can get out of hand quickly when you think of what impact it needs to have with regular instructions in the pipe. The sheer number of permutations and combinations are not easy to catch with simulation – something trivial with formal methods.

Issue truly fixed or not?
Upon downloading the fixed RTL, we found that this functionality has indeed been fixed; however, we also found that the properties checking the bit manipulation instructions previously proved in formal now fail. So, whereas the fix provided by the designers did seem to fix this issue, the designs are broken somewhere else – a classic artefact of how formal finds bugs.

Before the design fix, we locally constrained the environment in formal to block the issue of an illegal CHERI load, as we suspected that to be an issue. Upon blocking this instruction, the bit manipulation checks proved exhaustively in formal. However, once the new design with the CHERI load issue fixed was exercised, the bit manipulation properties failed, exposing additional issues with the design fix.

formalISA: Bread-and-Butter Exhaustive Formal Verification for RISC-V
The best part of finding this bug for us was that this was a standard RTYPE assertion check that is part of the formalISA app that failed in Cadence JasperGold in 20 seconds! We didn’t have to do anything special to find this bug other than design the formal testbench upfront, which is automated and configurable as part of the formalISA app.

Conclusion
As formal experts, we have been deploying formal in the field for the last seven years. The Axiomise team has verified over 150 designs with formal, including server-grade CPUs, GPUs, NoCs, video IP components, caches, cache controllers, memory controllers, power controllers, interrupt controllers and so on.

We believe that verification should go beyond the bread-and-butter case test-based verification and formal methods would be a great way to not only establish sanity checking in the first hour of design bring up but also in finding corner-case bugs really quickly and then mathematically proving that the fix works.

RISC-V-based silicon will soon hit volumes, and then these corner-cases will show up in silicon much more quickly — something designers don’t necessarily see in prototype implementations. We strongly believe that corner-case bug hunting is imperative and can be very well addressed with formal methods.

— Ia Tsomaia is a Masters student in embedded computing systems at the University of Southampton.



Leave a Reply


(Note: This name will be displayed publicly)