中文 English

Do You Know For Sure Your RISC-V RTL Doesn’t Contain Any Surprises?

Formally verifying that a RISC-V ISA is free from gaps and inconsistencies.

popularity

Given the relative novelty and complexity of RISC-V RTL designs, whether you are buying a commercially supported core or downloading a popular open-source offering, there is the small but non-zero risk of unwanted surprises escaping undetected into your end-product. In order of high-to-low probability, consider:

  • The presence of a weird-yet-entirely-possible corner-case bug
  • Bugs “inside” the custom instructions that you or your vendor are creating for your application
  • Bugs “at the edges” of a custom instruction — e.g., the instruction executes correctly, but it somehow leaves the machine in a corrupted state
  • Related: undocumented and/or poorly specified new features that unwittingly open up holes in the design
  • Malicious Trojan logic surreptitiously inserted via a supply chain attack

Common mitigation approaches

Naturally, the first line of defense is expert inspection of the incoming or developing RTL code. Obviously, this should be done; but it should be equally obvious that this technique — as they say in the mathematical world — “is necessary but not sufficient” (even if you had Professor Patterson himself review your code).

The next line of defense is applying simulation-based approaches: Instruction Set Simulation (ISS), automated comparison of your DUT to mature golden models, constrained-random UVM testbenches for DUT RTL simulation, and even piping real world stimulus into hardware-assisted emulation of the DUT. Again, these approaches are all valuable and should be done; but they all suffer from the same flaw: they are inherently incomplete as they rely on stimulus generation. For example, in the extreme-but possible case of a supply chain attack, the Trojan logic developer has deliberately created a triggering sequence of signaling and data that will likely escape detection by even the most creative white hat hacker. And of course, functional bugs have their own way of using naturally occurring entropy to remain hidden.

The bottom line is that the only way to be completely sure that your RISC-V RTL is free of any natural or malicious surprises is to apply exhaustive, formal methods to verify the design.

Easier said than done, right?

Yes and no: 14 years ago, this sort of analysis was only doable by PhD-level researchers using their own handmade programs. But since 2008, the tools and techniques have been productized such that anyone who is familiar with the basics of formal verification and the writing of IEEE standard System Verilog Assertions (SVA) can rapidly apply and succeed here.

A three stage formal-based process

The recommended formal-based flow unfolds as follows:

  1. Create a formal testbench that “models” the DUT specification
  2. Define the input constraints and checks to be run against the DUT
  3. Execute analysis

Step 1 – Create a formal testbench that “models” the DUT specification

The foundation of this methodology is to write a set of properties that represent the behavior of each instruction in your RISC-V design. The task is to capture the effect of a given instruction on the IP’s outputs and internal architectural states (in the RISC-V world, this is the program counter (PC) and register file (RF)) for any given arbitrarily long input sequence. This is done using a purpose-built extension to IEEE SVA called Operational SVA. In a nutshell, this is a library that’s shipped with the formal-based processor verification tool; and from the verification engineer’s point-of-view, it looks like an intuitive subset of familiar SVA code. Figure 1 shows the generic structure:

property instruction_A;
	// conceptual state
	t ##0 ready_to_issue() and

	// trigger
	t ##0 opcode==instr_A_opc

implies
	// conceptual state
	t ##0 ready_to_issue() and

	// memory interfaces outputs
	// read next instruction
	t ##0 imem_access(instr_A_opc) and
	// data load/store
	t ##1 dmem_access(instr_A_opc) and

	// architectural registers
	t ##1 RF_update(instr_A_opc) and
	t ##1 PC_update(instr_A_opc)
endproperty

Fig. 1: Structure of Operational SVA code capturing the specification of a processor instruction. [1]

Referring to figure 1, the left-hand side of the implication (the portion of code above the keyword implies) identifies when the machine is ready to issue a new instruction, and the instruction being issued. The assertion captures the current values of the architectural states and, in the right-hand side of the implication (portion of code below the keyword implies), proves their next values.

Additionally, the processor outputs are to be proven correct — in this case to make sure that the instruction accesses the expected data and instruction memory locations. The assertion also proves that the machine is ready to issue a new instruction on the next cycle. This is crucial to decouple the verification of an instruction from the sequence of instructions of which it could be a part. For example, instruction A could execute correctly, but leave the machine in a corrupted state. This erroneous state could cause the next instruction B to produce incorrect results through no fault of its own. Hence, with the Operational SVA approach, the assertion verifying instruction B would pass, while the assertion verifying instruction A would fail (where read and write memory operations could last several cycles).

Bottom-line: the functions representing architectural states in the Operational SVA code must be mapped to the implementation signals and must reflect the microarchitecture of the processor. The status of the RF, for example, might depend on the activation of forwarding paths. [1]

[Note: For those of you familiar with either simulation or formal functional coverage, this notion of completeness does not rely on structural coverage metrics, or on creating and collecting metrics for a functional coverage model. Instead (and getting a little ahead of the remaining steps and results), the analysis output here is all about getting full proofs for all the properties. Full proofs also implicitly show that there is no other functionality — that was unspecified or unexpected (spec or coding bug), or maliciously inserted — that is not captured by this set of properties. Rephrasing, this methodology is all about achieving 100% “test plan coverage” as verified by the exhaustive formal analysis results — where nothing is more “complete” than a mathematical proof!]

Step 2 – Define the input constraints and checks to be run against the DUT

To complement the specification properties for each instruction, the next step is to define the input constraints and any additional output checks. Again, Operational SVA is employed, where the user now specifies a “completeness plan” to define both legal inputs and illegal signaling that the DUT should be ignoring. Per the example shown in figure 2, there are three sections: determination assumptions, determination requirements, and property graph.

completeness processor;

determination_assumptions:
	// INPUTS
	determined(imem_data_i);
	determined(dmem_valid_i);
	if (dmem_valid_i)
		determined(dmem_data_i)
	endif;

determination_requirements:
	// OUTPUTS
	determined(imem_read_o),
	if (imem_read_o)
		determined(imem_addr_o)
	endif;

	determined(dmem_enable_o);
	if (dmem_enable_o)
		determined(dmem_rd_wr_o),
		determined(dmem_addr_o)
	endif;

	// ARCHITECTURAL STATES
	determined(PC);
	determined(RF);

property_graph:
	all_instructions :=
	instruction_not_a, instruction_add_a,
	instruction_sub_a, [...]

	all_instructions -> all_instructions;
end completeness;

Fig. 2: Example of a completeness plan with conditional determination assumptions and requirements.

To elaborate:

  • “determination_assumptions” is simply a fancy name for input value constraints
  • “determination_requirements” is a definition of the signals that must be verified (both the processor’s output signals and architectural states)
  • The “property_graph” section is simply a binding of this file to all the properties created in Step 1

Recapping where we are at this point: in Step 1 you created what is effectively a cycle-accurate model of the DUT that must be proven true for all time and all inputs; in Step 2 you set up the constraints and any special behaviors to look out for. Add these together, and in effect you have a formal testbench that’s ready to run!

Step 3 – Execute analysis

The goal of all formal tools is to exhaustively prove that all properties are true for all time and all inputs. In the case of RISC-V processor verification, the tool works to prove that any arbitrarily long input sequence can be matched to a unique sequence of the specified Operational SVA that predicts the values of outputs and architectural states.

And this is exactly what happens. If there is any difference in behavior detected between the specification and DUT, the formal tool delivers a “counter example” waveform showing exactly the series of input signals and data that can create a violation of the spec. As alluded to above, such failures can be found in the interior of instruction’s RTL logic or in the “handoff logic” that tees up the next legal branch/instruction.

Either way, when these issues are fixed and the tool proves all the properties, the results are truly “complete” — i.e., you can be mathematically certain that there are no RTL coding errors — the formal analysis has literally proven the absence of any bugs!

Results

First, as noted above, over the years many processor developers have benefited from this flow [2], [3], [4].

Putting this methodology to the test with RISC-V, my colleagues did a case study using the popular Rocket Chip open-source core. Specifically, the RV64IMAFDC – sv39 vm configuration was examined. This is a 64-bit processor core with a 39-bit virtual memory system [5] and extensions, such as compressed and atomic instructions [6]. This open-source, RISC-V ISA implementation uses a five-stage, single-issue, in-order pipeline with out-of-order completion for long-latency instructions, such as division or cache misses. The core also supports branch prediction and runtime modification of the “misa” register, allowing it to switch off certain instruction set extensions.

Although this snapshot of the Rocket Chip had been extensively verified and taped out multiple times, four previously unknown, corner-case, suspicious behaviors were identified and reported to the Rocket Core RTL developers. These issues ([7], [8], [9], and [10]) were discovered solely through the systematic application of the complete formal verification approach outlined in this article.

Elaborating on [10] specifically — the discovery of a non-standard instruction CEASE (Opcode 0x30500073) in the RTL: clearly the Rocket Chip team fell behind on their documentation (and they fixed this almost immediately upon filing of the GitHub pull request). The larger point is that this demonstrates that the logic required to implement an entire instruction — dozens of lines of code and many gates of synthesized, placed, and routed logic — can escape detection by visual inspection, RTL simulation, gate level simulation, the whole back-end implementation process, and hardware prototypes in the lab!

But what about the performance of this flow?

First, let’s address the larger meaning of “performance.” Due to the novel nature of the Rocket Chip design, for this case study it took roughly 20 engineering-weeks for our formal verification practitioners to develop the whole testbench and constraints. However, prior applications of this flow on more structured, commercial IP typically have taken a fraction of this time. Naturally, the whole bring-up process will go a lot faster the more stable and mature the specs are, how well documented and readable the DUT RTL code is, and how much access you have to the design engineers for Q&A.

Once the environment was set up, the wall-clock runtime was all of 2 hours — i.e., a lot less than you could reasonably expect from constrained-random RTL simulation and even hardware-assisted verification. Plus, recall that the results of this analysis are valid for all inputs and all time — in a word, they are exhaustive [11].

Summary

The complete, formal-based processor verification approach presented in this article uses an extension to IEEE SVA, Operational SVA, to formally verify that a RISC-V ISA is free from gaps and inconsistencies. Unlike constrained-random simulation testbenches, emulation, or physical prototyping, the complete set of properties and constraints exhaustively detects many types of RTL errors, as well as the presence of undocumented or poorly specified code and malicious Trojans.

References

1 – 2019 GOMACTech Conference, Albuquerque, NM, March 28, 2019: Complete Formal Verification of RISC-V Processor IPs for Trojan-Free Trusted ICs, David Landoll, et.al.

2 – DVCon 2007: Complete Formal Verification of TriCore2 and Other Processors, Infineon Gmbh.

3 – 51st Design Automation Conference (DAC): Formal Verification Applied to the Renesas MCU Design Platform Using the OneSpin Tools

4 – DVCon Europe 2019: Complete Formal Verification of a Family of Automotive DSPs, Bosch Gmbh.

5 – The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Document Version 1.10.

6 – https://github.com/freechipsproject/rocket-chip [accessed December 20th, 2018]

7 – DIV Instruction Result not Written to Register File
https://github.com/freechipsproject/rocket-chip/issues/1752

8 – JAL and JALR Instructions Store Different Return PC
https://github.com/freechipsproject/rocket-chip/issues/1757

9 – Illegal Opcodes Replayed and Causing Unexpected Side Effects
https://github.com/freechipsproject/rocket-chip/issues/1861

10 – Non-standard Instruction CEASE (Opcode 0x30500073) Discovered in RTL, https://github.com/freechipsproject/rocket-chip/issues/1868

11 – Verification Horizons blog, How Can You Say That Formal Verification Is Exhaustive?, Joe Hupcey III
https://blogs.sw.siemens.com/verificationhorizons/2021/09/16/how-can-you-say-that-formal-verification-is-exhaustive/



Leave a Reply


(Note: This name will be displayed publicly)