The verification of a processor is far more complex than that of a comparably sized ASIC, and RISC-V processors add yet another layer of complexity.
Semiconductor Engineering sat down to discuss the verification of RISC-V processors with Pete Hardee, group director for product management at Cadence; Mike Eftimakis, vice president for strategy and ecosystem at Codasip; Simon Davidmann, founder and CEO of Imperas Software; Sven Beyer, program manager for processor verification at Siemens EDA; Kiran Vittal, senior director of alliances partner marketing at Synopsys; Dave Kelf, CEO of Breker Verification; and Hieu Tran, president and CTO of Viosoft Corporation. What follows are excerpts of that conversation.
SE: What makes the verification of RISC-V processors different from any other ASIC?
Kelf: Processor verification is a whole new world. We know that Arm and Intel have set a very high bar for expectations of processor quality. In RISC-V, we have to try and follow that. First, there are the standard verification activities: making sure it works, making sure the micro-architecture is sound, and getting compatibility with the instruction set. Then, you have to make sure these devices actually fit in the system they're going to work with. Do they fit in terms of interrupts working properly? Is the system coherent? Does the load/store work correctly between the processor and the rest of the system? Is it efficient enough? Are there bottlenecks? All these factors have to be tested in processors, and they don't have to be tested in regular blocks or the uncore blocks. That is the big difference. And trying to realize this verification is very complex. Arm has invested something like $150 million. A large number of clock cycles is required to verify these cores. RISC-V is still building up toward that, and has a long way to go.
Hardee: The thing that's different about processor verification is that what we're trying to achieve is that every instruction in the ISA gives exactly the same results, behaving exactly according to the programmer's manual in every circumstance. The big difference between processor verification and ASIC verification is that ASICs generally are used in a more constrained environment for a more defined purpose, whereas processors are used in completely unforeseen ways. Basically, a processor has to operate exactly to the programmer's manual, no matter what the programmer is trying to do, or what application it's going into. It's a much tougher problem than general ASIC verification.
Eftimakis: To create a RISC-V CPU, and many have tried, you have to be either completely reckless or a verification expert. An IP block has a defined set of inputs and a defined set of outputs. A processor has to run all the possible programs that may be invented in the future, and all those that have been invented in the past. That means it's impossible to be complete, or even to foresee all the possible combinations of events that will happen, combined with random interrupts coming in, and so on. It requires complex verification at a level that you don't need for standard IP, especially when you know in which configuration that IP will be used.
Davidmann: These processors at the high end need thousands of millions of millions of instructions. 10¹⁵ is what Arm quotes for the number of cycles it takes to verify a high-end application core. RISC-V is going to get to those cores. This is something that most people have never come across. They are verifying small blocks with UVM, doing unit tests, putting together an ASIC, but they license in these huge processors, so very few people do processor verification. It's a publicly unknown art form, and the challenge of RISC-V takes it to the next level, because there is so much in the RISC-V spec that is implementation-defined. In the privileged spec, you can go down all these different paths, and they're all legal. So the first challenge is huge, and people don't have that experience. The second challenge is that there are all these choices you can make. These are implementation choices, they're all perfectly legal, and all should get the same outcome. And the third one is that the reason RISC-V exists, or is going to be accepted, is the freedom it gives you to innovate and choose new designs. That means every processor is likely to be different by design, and yet still has to be compliant. So you have three nightmares, and you have to figure out how you can do this, how you can replicate that amount of verification. We calculated it requires 30 years on an emulator. That's a huge amount of verification. And then you've got all these implementation choices, so everybody can be different, and how do you test that? Another challenge is how you ensure compliance when you're extending the ISA. These are three or four really big challenges, for which the world does not have the experience.
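To put those numbers in perspective, a rough back-of-the-envelope calculation reproduces Davidmann's 30-year estimate. Assuming an emulator throughput on the order of 1 MHz (a round-number assumption for illustration; the panel does not state a speed):

$$\frac{10^{15}\ \text{cycles}}{10^{6}\ \text{cycles/second}} = 10^{9}\ \text{seconds} \approx 31.7\ \text{years}$$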
Tran: What’s the difference between verifying a processor core versus ASIC, or any other type of peripheral? The one thing that I noticed is it is context-dependent. The behavior of a particular instruction depends on what executed before it. You can’t really verify the behavior of a processor simply by feeding it a bunch of random instruction, and seeing if it executes the same way. Contextually, even in a single processor, you’ve got user mode versus system mode, and you’ve got a sequence of instructions that need to be verified, and these are not being done thoroughly today. When RISC-V grows up and starts to become more viable for things like Linux, you will see the verification challenge becoming compounded by the fact that there aren’t enough mechanisms to be able to thoroughly vet the behavior of RISC-V under the context of something more complicated than just an RTOS or bare-metal operating system.
Vittal: Processor verification is challenging, as everybody else has said. There is a lot of flexibility, and exhaustive validation of a processor is not easy. The main difference between an ASIC and a processor is that there's a lot more data path involved. We all know the famous Intel floating-point bug. When you have that much data path, and you don't know the different applications of this processor in a given environment, you cannot check every single combination by brute force, so you need some very specific techniques, methodologies, skills, and adoption of tools. For a standard ASIC, you have different IPs. You probably have standard protocols, and you have verification IP to help verify certain parts of the ASIC. But for the processor, you have to dive deep and verify all possible combinations.
Beyer: Processor verification used to be a discipline limited to the Arms and Intels of this world. They had the expertise, the methodologies, and the compute power to actually get it done properly. And now, with RISC-V being free, essentially everybody is trying to replay this in order to make full use of the freedom of RISC-V. On one hand, this has the same complexity issues that Intel and Arm were facing, but on the other hand, because of the freedom to customize, if you add your own custom registers and instructions, you add a whole new layer of complexity beyond what a fixed processor faces.
SE: We used to have validation and verification, but it seems as though we’ve got a third one that’s thrown into the mix, which is conformance. How do these things relate to each other? How much overlap or separation is there between each of those tasks?
Hardee: I’m not sure that conformance is a separate thing, but we definitely have to consider both validation and verification. In the RISC-V world, there are ways of describing the processor that I’m trying to create. Chisel is one of the leading ways of doing that. What I need to do is to validate that I’m creating a processor that meets the requirements, that has everything in the ISA that is needed, and that the ISA is implementable. Then there is verification, which is checking that every piece of that ISA, the implementation of that ISA, is doing exactly the right thing in every circumstance, in every combination, including ones that are not foreseeable when you’re starting out — anything a programmer might do with this thing in the future. So those ideas of validation and verification already cover it. What am I validating against, and what am I verifying against? That’s where a golden reference model of the ISA is becoming exceedingly important. Models based on SAIL, and variants of SAIL, are emerging, which have a more complete formalized definition of what’s being built. Things like that are exceedingly important to verify against, to make sure that I’m covering all of these eventualities. We’re still at a nascent stage there. Chisel borrowed some concepts from other programming languages, etc. SAIL made an improvement on that in terms of moving toward a golden reference that can be verified against. Look at what Arm and Intel have done in terms of their disciplined approach to describing their ISA, and how that ISA is captured. Arm has presented at the Jasper user groups previously about formally verifying the ISA. Having that golden reference model, that is truly golden, and being able to verify against it with all the methods under the sun. We’ve talked about simulation and emulation, but to find all of the eventualities, formal methods are indispensable. So all of those things have to be used. Compliance always has been a big thing. Intel made a huge deal about compliance to the x86. Same thing with Arm. I’m not sure that compliance is a distinctly different thing. You can cover it with validation and verification, you just got to be a lot more disciplined about it.
Davidmann: Chisel came out of Berkeley, but the industry is not using Chisel. Very few people are: just a couple of academics, one or two companies. The majority of designs are done in Verilog and SystemVerilog. And why is that? It is because you cannot verify something that you can't debug. People might use generators and all this sort of stuff to generate the design, but all design is really done in Verilog or SystemVerilog. The future is going to be better languages. I'm a big fan of architecture definition languages, and of Chisel for constructing hardware. These are great, and the next generation of SystemVerilog is going to have that. But before a new language is adopted, it has to have simulators, debuggers, everything with it. You cannot just have a new language. When we built SystemVerilog 20 years ago, that's where we started. And it drove us nuts, because you couldn't debug in the language you wrote in. Chisel still has that problem.
Eftimakis: It is the right time to talk about high-level languages, because that's exactly what we do to develop our processors. We have a methodology and tools that not only generate the RTL from the high-level language, but also generate the software environment: the debugger, the compiler, and so on. That ensures that everything fits together, which is needed if you want to be able to work with it.
Tran: If you look at RISC-V, Chisel may be seen as the prevailing language used to create RISC-V cores, but there are others, such as SpinalHDL and Bluespec, and there are more varieties in addition to those. What we're talking about is the diversity of different implementations that are being used in different ways. Keep in mind, RISC-V has not yet been deployed as the primary compute resource in embedded systems. It's always some sort of microcontroller, or at best some accessory inside a system on chip.
Eftimakis: That’s not true. In embedded systems, you have plenty of systems that have RISC-V as the primary processor.