Doing what has been done in the past only gets you so far, and RISC-V is forcing some aspects of verification to be fundamentally rethought.
Semiconductor Engineering sat down to discuss gaps in tools and why new methodologies are needed for RISC-V processors, with Pete Hardee, group director for product management at Cadence; Mike Eftimakis, vice president for strategy and ecosystem at Codasip; Simon Davidmann, founder and CEO of Imperas Software; Sven Beyer, program manager for processor verification at Siemens EDA; Kiran Vittal, senior director of alliances partner marketing at Synopsys; Dave Kelf, CEO of Breker Verification; and Hieu Tran, president and CTO of Viosoft Corp. What follows are excerpts of that conversation.
SE: Arm and Intel seem to be the benchmarks, but neither ever had to deal with the customizability and extensibility that RISC-V has. Is following Arm and Intel the right path?
Hardee: I firmly believe RISC-V is going to be a huge player in domain-specific processing. Architecture licensees of Arm are able to configure the processor for specific domains. What we’re seeing with RISC-V is domain-specific data path processing being added to the RISC-V core to cover the various domains where it can find application. This is great. It is really powerful that RISC-V is able to do this. But, of course, it adds to the verification challenge. We are seeing huge adoption of formal and continued usage of dynamic verification, and, as mentioned, emulation continues to be important, along with testing this out at the system level with PSS-based tools. The appeal of RISC-V is the ability to configure it for specific domains better than may be possible with existing, less flexible ISAs.
Davidmann: The technologies that have been built in the past inside Intel, Arm, and MIPS have been very constrained in the freedoms they needed to worry about. With RISC-V, it is a complete explosion of the configurability that’s needed. We shouldn’t just learn from and copy what’s been done. We need to see what’s been done and apply it to the modern challenges. Fifteen years ago we built models around Arm and MIPS and PowerPC. Those models weren’t very configurable, extendable, or customizable. But because we were independent from all of them, we could focus and build our technology so you could choose different bits of the ISA, different configurations. When RISC-V came along, we’d already thought about this scalability, configurability, and extendability. We had 12 different ISAs. It is not just a simulator. It is the modeling capability, the extension capability, the debuggers, the analysis, the verification tools. RISC-V is such a challenge for everybody else, whereas our technology was already factored in a way in which we could apply it. We can’t just pick up the technologies that existed before for other ISAs. We need to work out how they can be adjusted to RISC-V. Look what happened with GCC. RISC-V has something like 70 extensions, and the compiler developers have absolutely given up and said, ‘There is no way we can meet and test 70 different interacting combinations.’ It’s impossible for the tool chain. They cannot live with those crazy configuration options. What RISC-V is doing is moving to what they call profiles and platforms. They are saying there is this profile, which is intended for embedded use in this area, and it’s going to have these extensions as mandatory. They are trying to reduce that freedom because the existing compiler tool chains can’t cope. If we want to gain what RISC-V can offer, we have to build better technologies in the infrastructure around it: the simulators, the compilers, the verification tools. They all have to have the flexibility and freedom that RISC-V gives you to extend it and configure it. We can’t just take what existed before.
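To make the profile idea concrete, here is a minimal Python sketch. The profile names and extension lists are hypothetical, simplified stand-ins for the real RISC-V profile definitions; the point is that a profile fixes one mandatory set of extensions that a core configuration and its toolchain can be checked against, instead of every tool facing every combination of optional features.

```python
# Minimal sketch (hypothetical profile names and extension lists): a RISC-V
# "profile" in practice is a named set of mandatory extensions that a core and
# its toolchain must both support.
PROFILES = {
    "embedded-min": {"I", "C", "Zicsr"},
    "app-class":    {"I", "M", "A", "F", "D", "C", "Zicsr", "Zifencei"},
}

def check_core_against_profile(core_extensions, profile_name):
    """Report which mandatory extensions of a profile the core is missing."""
    mandatory = PROFILES[profile_name]
    missing = mandatory - set(core_extensions)
    return sorted(missing)

if __name__ == "__main__":
    core = {"I", "M", "C", "Zicsr"}  # hypothetical core configuration
    print(check_core_against_profile(core, "app-class"))
    # -> ['A', 'D', 'F', 'Zifencei']
```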
Vittal: For compilers, with all the combinations, it takes forever to get the performance. There is a lot of opportunity, and we have a tool that uses AI-enabled techniques to improve the performance of your software stack. Going back to the original question, the challenge is with respect to the domain-specific applications. When everybody is customizing to their needs, there is a lot of flexibility incorporated. How do you verify? That’s the biggest challenge.
Eftimakis: RISC-V has been introduced to enable this capability to customize — what we call custom compute. With Moore’s Law ending, this is the only way to gain in terms of performance or efficiency in general. RISC-V was there to do that. It requires a new mindset. It requires new tools. And that’s why we have developed tools where you don’t think in a fixed way about one instruction set. We describe the instruction set in a high-level language, and then we generate everything from that description: the RTL, the compiler, etc. It doesn’t work if you use the old ways with fixed compilers and try to include all options in the same compiler.
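As a rough illustration of that single-source generation flow (a toy Python sketch, not Codasip's CodAL or any real generator), the idea is that one high-level instruction description drives every downstream artifact, so adding a custom instruction in one place propagates to the generated decoder and toolchain tables.

```python
# Toy sketch of "describe the ISA once, generate the rest"; all names here are
# illustrative, not any vendor's actual description language.
from dataclasses import dataclass

@dataclass
class Instr:
    mnemonic: str
    opcode: int      # 7-bit major opcode field
    funct3: int
    semantics: str   # informal; a real tool would hold executable semantics

# Hypothetical, tiny ISA description, including one custom-extension instruction.
ISA = [
    Instr("add",        0b0110011, 0b000, "rd = rs1 + rs2"),
    Instr("and",        0b0110011, 0b111, "rd = rs1 & rs2"),
    Instr("mac.custom", 0b0001011, 0b000, "acc += rs1 * rs2"),  # custom-0 opcode space
]

def gen_decoder_table(isa):
    """Generate an (opcode, funct3) -> mnemonic lookup, as a generated ISS or RTL decoder would use."""
    return {(i.opcode, i.funct3): i.mnemonic for i in isa}

def gen_assembler_mnemonics(isa):
    """Generate the mnemonic list a generated assembler back end would accept."""
    return [i.mnemonic for i in isa]

print(gen_decoder_table(ISA))
print(gen_assembler_mnemonics(ISA))
```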
Davidmann: For RISC-V, to give the designers the freedom they need, we need to address the issue of what the next-generation languages are. We are going to have to move to languages that allow this configurability of hardware in a different way than is available in what I’d call implementation languages like SystemVerilog. That is implementation, and it goes into a design compiler, but what you need is the ability to reconfigure your hardware at more abstract levels. I believe there is a future for such languages, and when that happens, RISC-V will be easier for everybody.
Kelf: Most of the RISC-V developments have been relatively small, embedded processors — single-core types of devices. Now, companies are starting to look at application processors, multi-core devices, and tiled SoCs with many cores. This is where the rubber is really going to hit the road in terms of the verification of these devices. So far, everything has been relatively insular, looking at the processors themselves. Now we’re going to see these very advanced systems. Even after you get an application processor working, you then have to worry about things like coherency. This is where we do need to borrow from Arm. I agree that we can’t just keep the old methodologies and expect to build these super-configurable processors, but we do have to stand on the shoulders of giants. We have to learn from what Arm and Intel have put thousands of engineering hours into, trying to figure out how to deal with concurrency across these massively parallel systems.
Hardee: It is not just coherency. When you start moving from single-issue to more complex architectures, many things come in: issues with branch prediction, with prefetching instructions, and with out-of-order execution. Take the OpenTitan core used for root of trust. It doesn’t have any of that vulnerability. As you move into the bigger application architectures, you have to be more sophisticated to get the performance that’s needed, and that’s when a lot of this comes in. So we’re at the tip of the iceberg.
Kelf: Concurrency is just one example. There are dozens of things like those you just mentioned, which we haven’t started to touch yet. These big companies have figured a lot of this out. We can borrow from them, but we have to build the new things needed for the configurability and for additional instructions. Tensilica and ARC have done this, as well. So we need to be looking at what those guys did.
Davidmann: With RISC-V, a lot of people are doing single-core embedded, but there are an awful lot of people who are putting down quite large, sophisticated processor arrays, whether they’re big vector engines, in-order, out-of-order, all of that sort of stuff. A lot of people are doing big arrays of processing elements. Some of them have multiple processors. RISC-V has not yet been used for high-performance application processors. When Arm started thinking about going into the data center, we thought it was going to take a long time, and it took them over 10 years. RISC-V has another 10 years before it’s got high-end application processors. Some will say we are getting there faster than that. That is where the real verification challenges and nightmares come. A small 32-bit core is hard, but the high-end, high-performance core is much worse. We’ve got several customers that have application processors in RISC-V, and they say it boots Linux. When we applied our DV solution, in the first day we found 20 bugs. That was a single-threaded RV64GC, a relatively simple core. The future challenge for RISC-V verification is in the high-end application processors, which are multi-issue and out-of-order, with virtual memory and cache coherency. We have out-of-the-box solutions for the low-end stuff, single-threaded, even out-of-order, but the high-performance application processors are where the big challenges come in.
Kelf: We all remember when Arm went from Armv7 to Armv8. It was a huge leap. It was exactly the same leap that we’re talking about here. Arm’s internal verification went from a lot of cycles to 10^15 cycles. It was all these kinds of things that we’re talking about now. Arm did figure this out, and we do know a little bit about what they do internally. We’re all going to have to do the same thing for RISC-V, and we can learn from them, and then multiply it for all the new challenges.
Hardee: Arm has given many presentations on this, and one of the things with formal is that it may not scale to the big system. But Arm is using formal to verify things like the load/store units. Then you’ve got to scale up that verification using simulation and emulation to get at the full-chip-level problems.
Kelf: And figure out things like coverage, too.
Davidmann: When we’re talking to people about RISC-V verification and they say they are doing an application core, if they haven’t got emulation we get out quickly, because they don’t take the problem seriously. For an application core, if they haven’t got emulation, they do not understand where they’re going.
Vittal: It’s a combination of verifying both the software and the hardware. That requires emulation, prototyping, and virtual prototyping. Some of the RISC-V vendors are providing reference boards with hardware prototyping solutions. We are working on that. That eases the pain for adopters.
SE: With configurable, high-level languages, you have talked about generating compilers and other things necessary for the flow. Do we need to be looking at how you generate the coverage model from these high-level specifications? Then people would have a consistent goal to work toward for verification.
Hardee: A coverage model is a different concept. Going back to where we started with the differences between processor verification and ASIC verification, the same coverage models that cover all the scenarios against a verification plan for an ASIC don’t necessarily work for a processor. You have to cover every eventuality, including all of these unforeseen things — everything a programmer could possibly do with your core, now or in the future. And that’s where the traditional coverage models that we have, at the moment, are somewhat ineffective and not good enough.
Kelf: I would go further and say that we will have to rethink coverage. As we move toward broader application processors, trying to provide a functional coverage model to check loads and stores into a memory, where you’re going to be checking the same memory cells over and over again, just cannot work. UVM-style coverage models are very important, but nowhere near sufficient to check these new application processors. Scenario coverage, and other specification coverage-type ideas, are going to become much more important.
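One way to picture the difference is the minimal Python sketch below, with hypothetical event and scenario names. Instead of counting signal values in covergroup bins, scenario coverage asks whether meaningful ordered sequences of events were ever observed in an execution trace.

```python
# Minimal sketch (hypothetical event and scenario names): scenario coverage
# records whether meaningful sequences of system events occurred, rather than
# counting individual signal values the way a UVM covergroup does.
SCENARIOS = {
    "load_after_store_same_line": ["store_A", "evict_A", "load_A"],
    "interrupt_during_walk":      ["walk_start", "irq", "walk_resume"],
}

def scenario_hit(trace, scenario):
    """True if the scenario's events appear in order (not necessarily adjacent) in the trace."""
    it = iter(trace)
    # 'in' on an iterator consumes elements up to and including the first match,
    # so successive checks enforce ordering.
    return all(ev in it for ev in scenario)

def scenario_coverage(trace):
    return {name: scenario_hit(trace, seq) for name, seq in SCENARIOS.items()}

print(scenario_coverage(["store_A", "irq", "evict_A", "load_A"]))
# -> {'load_after_store_same_line': True, 'interrupt_during_walk': False}
```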
Vittal: We have been using AI-enabled techniques. We look at the quality of your constrained random stimulus, and then we use machine learning to improve it.
Hardee: We are all doing that, but it’s not enough.
Davidmann: Functional coverage models in Verilog and SystemVerilog are fantastic, and they do a great job for the instructions of a RISC-V core. But the challenge is when we get into the application processors that have an MMU and TLB. You’ve got a hardware page table walker. It’s just memory. There’s one register, and then you’ve got six levels of page table walking when there is a page fault. It’s all memory data structures. You can’t write SystemVerilog functional coverage for that. The UVM paradigm and SystemVerilog fall down there. We are having to create new methodologies and technologies so that people know how much they’ve tested in their RTL. It doesn’t exist anywhere today. There is no documentation for how people need to do it. Scenarios are very exciting for the future. There’s new stuff coming. There are challenges, and people don’t know publicly how to do coverage on things like the MMU.
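To illustrate what coverage on a memory-resident structure like the page tables might look like outside SystemVerilog covergroups, here is a minimal Python sketch assuming a hypothetical trace format: it mines page-table-walk events out of a simulation log and reports which combinations of walk depth, outcome, and page size have actually been exercised.

```python
# Minimal sketch (hypothetical bins and trace format): measuring coverage of an
# MMU's hardware page-table walker by post-processing an ISS or RTL trace.
from collections import Counter
from itertools import product

# Hypothetical coverage bins; the real depth range depends on the
# virtual-memory mode in use (e.g. Sv39, Sv48, Sv57).
WALK_DEPTHS = [1, 2, 3, 4]
OUTCOMES    = ["hit", "page_fault", "access_fault"]
PAGE_SIZES  = ["4K", "2M", "1G"]

def walk_coverage(trace_events):
    """trace_events: iterable of dicts such as {'depth': 3, 'outcome': 'hit', 'size': '4K'}."""
    seen = Counter((e["depth"], e["outcome"], e["size"]) for e in trace_events)
    all_bins = list(product(WALK_DEPTHS, OUTCOMES, PAGE_SIZES))
    covered = [b for b in all_bins if seen[b] > 0]
    return len(covered), len(all_bins), seen

if __name__ == "__main__":
    # Hypothetical events pulled from a simulation log.
    events = [
        {"depth": 4, "outcome": "hit", "size": "4K"},
        {"depth": 3, "outcome": "hit", "size": "2M"},
        {"depth": 4, "outcome": "page_fault", "size": "4K"},
    ]
    hit, total, _ = walk_coverage(events)
    print(f"page-table-walker bins covered: {hit}/{total}")
```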
Eftimakis: The tools are not perfect. Each of them needs to be combined with the others. This is what we call the Swiss cheese model. Imagine you have slices of cheese with holes in them. The bugs can go through the holes, but if you stack up plenty of cheese slices, you end up blocking the bugs. We need to accumulate these methods and tools to make sure that we end up with good coverage. But indeed, it requires a lot of different methodologies and different tools. And that’s why we use tools from different vendors, and we use methodologies that we have adopted from what was developed by Intel and Arm. That is required to reach the level we need for CPUs, because they are really complex.
Davidmann: One of the first things you realize is that you can’t go and build all of this stuff yourself. You need to use what’s there today, or get it built specifically for what you want. So the idea of waiting for open source to solve your problems is just wrong. You can’t verify anything without commercial tools.
Vittal: How do we democratize this, and how do we spread the knowledge? That’s where our challenge is, and we are making that happen.
Kelf: RISC-V verification represents the state of the art in verification. There’s a whole bunch of new thinking, which will be applied to a whole bunch of other verification problems.
Hardee: The winners in RISC-V will be the ones that invest in verification. There are some RISC-V players that are not investing in it, seeing RISC-V as the cheaper option because there are no license fees to pay, but those players are dying out. It’s becoming very clear that investing in verification is absolutely key to success in the RISC-V market.