Determining whether a processor implementation matches the specification is important, but conformance testing is not yet available.
Experts At The Table: Despite growing excitement and participation in the development of the RISC-V ecosystem, significant holes remain in the development flow. One of the most concerning is conformance, which must exist before software portability becomes possible. Semiconductor Engineering discussed the issue with John Min, vice president of customer service at Arteris; Zdeněk Přikryl, CTO of Codasip; Neil Hand, director of marketing at Siemens EDA (at the time of this discussion); Frank Schirrmeister, executive director for strategic programs and systems solutions at Synopsys; Ashish Darbari, CEO of Axiomise; and Dave Kelf, CEO of Breker Verification. What follows are excerpts of that discussion.
L-R: Arteris’ Min; Siemens’ Hand; Codasip’s Přikryl; Synopsys’ Schirrmeister; Axiomise’s Darbari; Breker’s Kelf.
SE: In the context of RISC-V cores, what do conformance, verification, and validation mean?
Kelf: I’m on the Certification Committee for RISC-V International, and even that committee wasn’t initially aligned on these definitions. We spent several meetings trying to figure that out. You’ve got conformance to the ISA. Is everything compatible with the ISA itself? What the certification committee really wants to ensure is that these processors are working properly. Not the verification of the micro-architecture, but architecturally, do they perform like any other processor and meet what is expected from the ISA specs? That would be certification. Verification is making sure the micro-architecture works correctly in all circumstances. That’s about as far as they can get.
Darbari: Conformance means the RISC-V design has implemented the RISC-V ISA correctly. The processor can have custom instructions that are not part of the ISA, but the published ISA's instructions must have precise semantics in the implementation and must match the published ISA. For example, a LOAD instruction is meant to load data from memory, not to move data from another register. Validation is about whether we are building the right design, and it is not limited to conformance alone. You could be conformant to the ISA and yet build a processor with performance, power, and area issues that were not the original intent. Verification is about ensuring that whatever requirements and specifications were agreed upon are met by the implementation, i.e., there are no mismatches or bugs. For example, any power optimizations introduced must not break it. As for certification, I'd think of it as a process where a neutral organization independently checks that all instructions in the ISA are implemented faithfully, and that no custom instruction breaks the functionality of an ISA instruction under any circumstances. This can only be done through the kind of rigorous checking formal methods make possible, as there can be millions of test cases where ISA instructions work correctly, yet corner cases where they don't.
Schirrmeister: You need to bear in mind what's in it for the user. Compared with other processor architectures, RISC-V stands alone. You can extend it and do your own thing, but then you're on your own, because you have to verify all the changes you made. Within the RISC-V definition there are things called profiles. These are variations that the RISC-V organization has defined and approved. You want to make sure you comply with these pre-defined profiles when you do RISC-V, because the main benefit is on the software side. Certification by itself doesn't really mean anything, unless you need lots of plaques for your walls. It's really about the software having a chance to be ported easily. You need good reference models that you can compare against, and with those you have the ability to verify against them. That avoids fragmentation, where everybody has their own port. If you compare this with other architectures, there's always the question, 'Is it ready for the software to use?' Other processor architectures have this thing called System Readiness, where you essentially specify the way the processor serves interrupts and deals with those situations for a specific profile. That is really what you want to compare against. And for standards, you need strong verification.
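In practice, a profile boils down to a named set of mandatory extensions that software is allowed to assume, so a compliance check reduces to a subset test. The sketch below is illustrative only; the extension list is abridged, not the full ratified RVA23 mandate, and the names are shorthand:

```python
# Minimal sketch of a profile compliance check. The mandatory set here is
# abridged and illustrative; consult the ratified profile spec for the
# real list.
PROFILE_MANDATORY = {
    "RVA23U64-abridged": {"M", "A", "F", "D", "C", "V", "Zicsr"},
}

def complies(profile: str, implemented: set[str]) -> bool:
    # A core complies if it implements every mandatory extension.
    # Extra (custom) extensions do not affect compliance.
    return PROFILE_MANDATORY[profile] <= implemented

# "Xmyaccel" is a hypothetical vendor extension; extras are allowed.
core_exts = {"M", "A", "F", "D", "C", "V", "Zicsr", "Xmyaccel"}
print(complies("RVA23U64-abridged", core_exts))  # True
```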
Min: CPU software falls into two big families. A lot of RISC-V is used as an embedded CPU, which means the chip designers control the software. Software portability doesn't matter, because the software is written for specific hardware. As the CPU gets adopted more widely, we have third-party software vendors writing software that assumes hardware independence, with a hardware abstraction layer, assuming things will be compatible. As RISC-V grows, unfortunately or fortunately, they built in extensibility. The profiles are trying to bring that together, because the third parties see the hardware differently.
Přikryl: It is a huge achievement that you actually agreed on what compliance to the ISA means, what certification means, and what verification means, because it's really hard to agree on that. Assuming that's done, it is a huge step. On the certification side, even architectural compliance alone is quite challenging, because you need to come up with tests that not only exercise sequences of instructions, but also things like going into supervisor mode or hypervisor mode and checking that they work correctly. Then, if your core gets certified, you can run an OS. It is going to work, because you passed all of these architectural tests. Verification goes way beyond that. The customization story of RISC-V is very compelling, and we need to find a way to deal with verification. We recently announced something that we call boundary customization. What that means is that you have the CPU and a profile for it, and if you do customization within the sandbox, then you guarantee that you can't break the profile. It is not just about adding simple instructions. You can do many things within certain boundaries that we drew, and then all you need to focus on is your incremental additions. You can guarantee compatibility with software, and you can benefit from the customization, assuming you recompile for that profile.
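Codasip's mechanism is proprietary, but the general principle can be sketched. The RISC-V base opcode map reserves major opcodes (custom-0, custom-1, and others) for vendor extensions, so a decoder that confines custom instructions to that space cannot, by construction, collide with standard decode. The decoder below is a minimal, illustrative sketch, not Codasip's implementation:

```python
# Sketch: custom instructions sandboxed in the reserved custom-0 space,
# leaving standard decode (and therefore the profile) untouched.
STANDARD_OPCODES = {
    0b0110011: "OP",       # R-type ALU ops: ADD, SUB, ...
    0b0010011: "OP-IMM",   # ADDI, SLLI, ...
    0b0000011: "LOAD",
    0b0100011: "STORE",
}
CUSTOM_0 = 0b0001011       # major opcode reserved by the ISA for custom use

def decode(insn: int) -> str:
    opcode = insn & 0x7F               # major opcode lives in bits [6:0]
    if opcode == CUSTOM_0:
        return decode_custom(insn)     # sandboxed: custom space only
    if opcode in STANDARD_OPCODES:
        return STANDARD_OPCODES[opcode]  # never altered by customization
    raise ValueError("illegal instruction")

def decode_custom(insn: int) -> str:
    funct3 = (insn >> 12) & 0x7        # sub-op within the custom space
    return f"CUSTOM0.f{funct3}"        # e.g., a hypothetical accelerator op

print(decode(0b0000011))               # LOAD, unaffected by the extension
print(decode((5 << 12) | CUSTOM_0))    # a custom op, decoded in its sandbox
```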
Hand: Part of the reason you started off by saying there's no clear definition, and part of the beauty of RISC-V, is that it means different things to different people. The concept of compliance is different from verification, and we do different things to address each. It is about the software. If you want to use the software ecosystem, you have to make sure you are compliant with the ISA, whether through formal methods, simulation methods, or whatever methods you want to use. Conformance becomes critical. But if you're going into a deeply embedded system and using this to replace a custom processor, you need to be compliant within your environment, and that means a different thing. And verification may mean, 'Am I verifying within my system? Am I verifying that the core works correctly? What is the scope of that verification? What have I changed? Do I want to change it?' If I change the core, I've got to verify everything, as opposed to saying, 'I want to do something in the SoC as an accelerator and leave the core separate.' One of the reasons agreeing on the terms is difficult is that, depending on what your aims are, it means doing a different set of things. The beauty of the RISC-V ecosystem is that you can still work within that environment and say, 'I want to pick that piece and utilize it without having to reinvent everything.'
Schirrmeister: It’s important for the purpose of this discussion to highlight the difference between certification, which I look at as comparing against something that is commonly agreed on, and verification. In the Venn diagram, certification is the common base. Verification is a wider issue and must cover everything, even beyond that commonly agreed space. If you add custom instructions, you have these high-quality models that you modify, and you add your additional instructions to it. The thing that is cool for the EDA landscape is that with RISC-V everybody needs to run lots of cycles to verify that it conforms and is fully verified. That’s where our interest comes in. There are different levels of scope that you need to verify. You need to verify the ISA itself. You need to verify the instructions you added. You need to bring in the heavy boxes, because you want to run lots of software, lots of cycles. You want to specifically identify the items that need to be tested. Test generation becomes very important. They need to run it all — simulation, emulation, prototyping. They’re all necessary.
Hand: You forgot formal.
Schirrmeister: Basically, all the standard technologies. But now, because everybody has this freedom to innovate, they have the duty to verify the things they changed. That’s where EDA comes in, but that verification bit is different. There is overlap and they touch each other, but it’s different from certification.
Darbari: I have a lovely example here. The 0riscy processor was already in silicon. We did some verification on it, and it turned out that the LOAD implementation was not the usual RISC-V ISA LOAD. What they had implemented was a MOVE instruction, the result of an accidental copy-paste error in the RTL. When the PULP team created 0riscy from RISCY, this instruction was left in by mistake. I can bet standard test-based simulation won't catch these issues easily, or at all in some cases.
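In miniature, the kind of reference-model comparison Darbari describes looks like the sketch below. The state encoding and the buggy behavior are illustrative stand-ins, not the actual 0riscy RTL; a formal tool performs the equivalent comparison symbolically over the full state space rather than by enumeration:

```python
def lw_reference(regs: dict, mem: list, rd: int, rs1: int, imm: int) -> dict:
    # RISC-V LW semantics: rd <- mem[regs[rs1] + imm]
    out = dict(regs)
    out[rd] = mem[(regs[rs1] + imm) % len(mem)]  # toy memory, wrapped addressing
    return out

def lw_buggy(regs: dict, mem: list, rd: int, rs1: int, imm: int) -> dict:
    # The copy-paste bug: behaves like a register MOVE and ignores memory
    out = dict(regs)
    out[rd] = regs[rs1]
    return out

# Exhaustive comparison over a tiny state space flags every divergence.
mem = [7, 11, 13, 17]
mismatches = [(v, imm)
              for v in range(4) for imm in range(4)
              if lw_reference({1: v, 2: 0}, mem, 2, 1, imm)
              != lw_buggy({1: v, 2: 0}, mem, 2, 1, imm)]
print(f"{len(mismatches)} of 16 state/input combinations diverge")
```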
SE: Why do we attack conformance as a verification problem? What we are trying to establish is that each instruction performs a rigidly defined set of actions and nothing else, and that there are no side effects of performing that instruction. Why are we not looking at this as, 'Can we show that your implementation does these things and nothing else?'
Darbari: Absolutely. Formal is not just for verification. We routinely use it to check for validation issues, including bugs in protocol specifications, and including one major one found in the RISC-V ISA itself. The published RISC-V ISA specification v2.2, on page 30, requires the generation of an illegal instruction exception for the shift instructions SLLI, SRLI, and SRAI if imm[5] is not equal to 0. However, it is not possible for imm[5] to be non-zero if the instruction decoded is one of SLLI, SRLI, or SRAI, as the specification provides the opcodes for these on page 104. One great advantage of using formal is that the stimulus is virtually infinite, provided you don't over-constrain, and this allows exhaustive checking over time to explore scenarios that are not meant to occur, meaning we can prove that any unintended implementation artifacts do not break the intended behaviors. As formal tools are fast at this analysis, finding such mismatches is a walk in the park.
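The encoding detail behind that observation can be sketched from the RV32I base tables. Bit 25 of the instruction word (imm[5]) is part of the fixed field that identifies the three shift-immediate instructions, so a word with imm[5] set never decodes as one of them in the first place. A minimal illustrative decoder makes the point:

```python
OP_IMM = 0b0010011  # major opcode for immediate ALU instructions

def decode_shift(insn: int):
    opcode = insn & 0x7F
    funct3 = (insn >> 12) & 0x7
    funct7 = (insn >> 25) & 0x7F   # imm[11:5]; imm[5] is its low bit
    if opcode != OP_IMM:
        return None
    if funct3 == 0b001 and funct7 == 0b0000000:
        return "SLLI"
    if funct3 == 0b101 and funct7 == 0b0000000:
        return "SRLI"
    if funct3 == 0b101 and funct7 == 0b0100000:
        return "SRAI"
    return None  # any encoding with imm[5] = 1 lands here, not on a shift

# An SLLI-shaped word with imm[5] set does not decode as SLLI at all, so
# the "raise an exception if imm[5] != 0" clause can never apply to a
# correctly decoded shift.
slli_imm5 = (0b0000001 << 25) | (0b001 << 12) | OP_IMM
assert decode_shift(slli_imm5) is None
```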
Hand: It is not that simple. You can do formal compliance checking against the ISA standard. Codasip has worked with their formal team. They've done it. Common customers have worked with it. But ultimately, does it matter if it's compliant if it doesn't do what you want it to do? It comes back to the difference between verification and validation. Verification is, 'Does it do what I intended it to do?' Validation is, 'Does it do what it was supposed to do?' You still need to run verification, and you still need validation to make sure that, in your system context, it is doing the right things. You are right that you can and should use formal if you've made any changes to the processor. You've got your extensions in there, and you want to make sure you haven't messed with anything else, because it is very easy to add a new memory addressing mode and find you've actually blown up the whole standard. But that doesn't replace other forms of verification. People still verify IP cores that they buy. They buy the IP core with the belief that it is a known-good core, and they still run verification. What formal can do is let me treat my modified processor as known-good IP. There isn't one solution for anything. You want to use a mix of formal and simulation to make sure your processor is correct. You want to be able to use the big boxes to make sure you're running software on it. You want to be able to use virtual prototypes early, to decide whether to run a particular set of workloads on it, or to benchmark different processor vendors. It is all of these working together.
Schirrmeister: This is not like formal tools checking the results of synthesis in logic design, where you use a logic equivalence checker to validate that the steps you used in logic synthesis were correct. It is not that simple when adding instructions and figuring out whether they are implemented correctly in RTL, because you need to validate that the processor states are correct and that the core interacts with the system correctly. It's probably a matter of perspective, too. For the processor core by itself, I could follow your definition a little bit, that it is validation within the system context. But from a user perspective, it's verification of the core plus the things they built around it.