The Uncertainties Of RISC-V Compliance

Tests and verification are required, but that still won’t guarantee these open-source ISAs will work with other software.


How far can a RISC-V design be pushed and still be compliant?

The answer isn’t always black-and-white because the RISC-V concept is very different from previous open-source projects. But as interest and activity in RISC-V continue to grow, constructive discussions are taking place to address some of the challenges of designing with an open-standard ISA.

“The RISC-V standard is something implementations are compatible with,” said Mark Himelstein, CTO at RISC-V International. “However, we generally don’t use the word ‘compliant,’ because the ultimate goal is to run apps across implementations that are compatible with a Profile. Compliance is a means to reach compatibility.”

One of the problems in narrowing down either of those definitions is historical context. Unlike some open processor projects, such as OpenSPARC, there is no source code involved. OpenSPARC provided access to RTL descriptions of Sun Microsystems processors rather than providing an ISA like RISC-V.

“RISC-V differs from earlier RISC instruction sets by being modular,” said Roddy Urquhart, senior technical marketing director at Codasip. “RISC-V requires the use of a base integer instruction set like RV32I or RV64I, and has ratified various standard extensions that are optional. It has also defined instruction encoding space for custom instructions. For ISA compliance, a necessary condition is to use the appropriate base integer set, but if a combination of standard extensions and/or custom extensions is used, then the ISA is RISC-V compliant.”
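The modular structure Urquhart describes can be illustrated with a minimal Python sketch that splits an ISA name into its base integer set and its extensions. It is an illustration under stated assumptions, not a reference parser: the function name `parse_isa` is hypothetical, only the two bases named above are recognized, and shorthand such as "g" and extension version numbers are ignored.

```python
# Base integer ISAs named in the text; other bases exist but are omitted here.
BASES = ("rv32i", "rv64i")

def parse_isa(isa: str):
    """Split an ISA string such as 'rv64imac_zicsr' into (base, extensions)."""
    s = isa.lower()
    base = next((b for b in BASES if s.startswith(b)), None)
    if base is None:
        # Without a recognized base integer set, the necessary condition fails.
        raise ValueError(f"{isa!r} does not start with a supported base integer ISA")
    rest = s[len(base):]
    single, *multi = rest.split("_")
    # Single-letter extensions follow the base directly; multi-letter ones
    # (for example zicsr) are separated by underscores.
    extensions = list(single) + [m for m in multi if m]
    return base, extensions
```

For example, `parse_isa("rv64imac_zicsr")` yields the base `rv64i` plus the extensions `m`, `a`, `c`, and `zicsr`, while a string with no recognized base is rejected outright.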

Still, there are a lot of pieces to this puzzle, and lots of wiggle room for how it gets defined.

“To be RISC-V compliant, you really only have to implement a very small subset of instructions,” said Rob Aitken, distinguished architect at Synopsys. “Then you can say your design is RISC-V-compliant because it implements the minimum integer instruction set. The whole idea behind it was an accelerator could be bolted into the system, then you could have whatever opcodes, etc., you wanted for your accelerator, which could co-exist in the instruction set space with the RISC-V minimal instruction piece. All of that is philosophically allowed as part of the standard. So effectively, ‘compliant’ only means I implemented the pieces of RISC-V that I said I implemented, and anything else I did is within the extensibility framework that’s defined, and it is what it is. That’s a different beast than previous ISAs that are out there, where if there’s any extensibility at all, it’s very constrained.”

Richard Grisenthwaite, executive vice president and chief architect at Arm, explained, “Compliance is binary: You are either 100% compliant or you are not compliant. If the software you want to run relies on instructions that you don’t build, then it won’t run on your hardware. And if you want software that runs on your device to be part of a widely used ecosystem, then it doesn’t make sense for it to use instructions that are not available on other hardware – this could be problematic in encouraging software writers to write non-portable software. This is why Arm and many other ISA providers, including open-source players, are working with a collection of named extensions that effectively standardize the architecture. If you stay within those extensions and specifications then compliance is not an issue, but this does not leave you with the flexibility you might expect from open source.”

Understanding compliance in the context of designing with open-standard ISAs remains challenging.

“To do it properly needs unbelievably huge amounts of resources,” said Simon Davidmann, CEO of Imperas Software. “Compliance is about control. If you want to be compatible with the definition, someone has to provide capability to demonstrate and enforce compliance. What companies like Arm do is control it in lots of different ways, as all ISAs have done in the past. They say, ‘This is our definition. You’re not allowed to change it out of this envelope.’ With Arm, you can’t add instructions, you can’t change the decode, you can’t add registers, because if you do any of that, it’s not an ‘Arm’ and the license they let you have does not allow you to do it. Even Google, Samsung and others that are architecture licensees are legally not allowed to stop it being an Arm.”

Arm does, however, allow architecture licensees to change the RTL a bit, but not the source code. “You can add an extra stage in the pipeline, or you can implement it differently, which Apple has done in its M1 and M2,” Davidmann said. “But how do they live up to the fact that it’s still an Arm? How is it compliant to the Arm? Arm provides significant technology, depending on what you license and how much you’re allowed to fiddle with it. For example, someone who is just licensing a small core and not allowed to change it will just get some pretty simple compatibility tests to check it because they’re not allowed to do anything to change it. They can’t really change the RTL, so there’s not a lot to be done. They can synthesize it, and they could target different things and get different gates out, but they’re not allowed to change the RTL, whereas an Arm architecture licensee like Apple or Google or Samsung or MediaTek can change the pipeline stages. They can do what they like, and actually, some can start from scratch. They can say, ‘We’ll take your document, and we’ll just build it any way we like.’ How do we prove it’s compliant? For the architecture licensees as part of their millions of dollars, they get a huge amount of compatibility technology, which includes references, tests, and frameworks. Then they can run it through and find out if it’s still an Arm or not, and they’ve got to fix it.”

Further, while users of embedded processors frequently compile their software from source code, rich operating systems and applications that run on them are delivered as binaries.

Urquhart noted this requires the combination of a base integer set and optional standard extensions to be consistent. “RISC-V International has standardized this by defining profiles such as RVA22U64, which specify a mandatory combination of standard extensions. Thus, in some contexts, compliance with a profile in addition to the ISA is important. If the design is implemented as RTL, for example, can it be verified against an instruction-accurate model? This would require that the model includes the base integer instructions, selected standard extensions, and custom instructions.”

Codasip Studio, for example, can generate a UVM environment to do this. But to put this in perspective, the compliance process has an impact on the design or the design process.

The first “Profiles” were ratified this year. Each contains a generation of extensions, comprised of instructions, state, and behavior, that work together, according to RISC-V International’s Himelstein. “Extensions are either mandatory or optional. Implementations that choose to be compatible with a Profile must implement all the mandatory extensions. Profiles give OSes and tool chains a single target to support across the community.”
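The mandatory-versus-optional rule Himelstein describes amounts to a set comparison. The Python sketch below shows that idea only. `DEMO_PROFILE` and both function names are hypothetical, and the extension lists are placeholders rather than the contents of any ratified Profile.

```python
# Hypothetical profile definition for illustration only; the real mandatory
# and optional lists live in the ratified RISC-V Profiles specification.
DEMO_PROFILE = {
    "mandatory": {"m", "a", "f", "d", "c", "zicsr"},
    "optional": {"v", "zfh"},
}

def missing_mandatory(implemented, profile):
    """Return the mandatory extensions the implementation lacks, sorted."""
    return sorted(profile["mandatory"] - set(implemented))

def is_profile_compatible(implemented, profile):
    """Compatible only if every mandatory extension is implemented.

    Optional extensions never affect the result.
    """
    return not missing_mandatory(implemented, profile)
```

An implementation that offers only `m`, `a`, and `c` would fail the check here, and `missing_mandatory` would report `d`, `f`, and `zicsr` as the gap to close.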

If there are non-compliance issues, the vendor publishes errata and a get-well plan. “If it is on purpose, which are allowed as custom changes, then the vendor just can’t brand their product as RISC-V Profile-compatible,” Himelstein said.

Every architecture has this issue. “We have a basic set of tests that verify against simulators that produce golden results,” Himelstein noted. “But it is up to the vendor to attest they have done adequate DV, system tests, software tests, etc. We are not a certifying body. It is the need to have a viable software economy that drives everyone to be Profile-compatible, so software vendors can have one release per OS type that works across multiple implementations. We are a community, and that community is a continuous improvement organization. We are always working on improving the simulator and tests.”
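Checking a design against golden results, as Himelstein describes, comes down to comparing what the device under test produced with what the reference simulator produced for the same test program. The sketch below assumes each side has already been reduced to a list of hex strings. That format and the name `first_mismatch` are assumptions for illustration, not a description of the official test framework.

```python
def first_mismatch(golden, actual):
    """Return (index, expected, got) for the first difference, or None if identical."""
    for i, (exp, got) in enumerate(zip(golden, actual)):
        if exp != got:
            return i, exp, got
    if len(golden) != len(actual):
        # One result list ended early; report the position where it ran out.
        i = min(len(golden), len(actual))
        exp = golden[i] if i < len(golden) else None
        got = actual[i] if i < len(actual) else None
        return i, exp, got
    return None
```

A result of `None` means only that this test matched the reference. As the article goes on to argue, a match says nothing about behavior the test never exercised.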

Risk of fragmentation
Fragmentation is always a worry in the software realm. While open hardware standards do not necessarily need to be supported by open-source compilers, there is a huge effort underway in both the LLVM communities and the GCC communities to support RISC-V, and deviation from the hardware standard has an impact on the open source implementation.

“In the RISC-V world, part of the standard that took a very long time to be ratified is the vector extension to the RISC-V standard,” explained Catherine Moore, software engineering director at Siemens Digital Industries Software. “A lot of hardware vendors really needed that extension before it was ratified, so many hardware vendors went off and implemented what they thought the standard might look like, and implemented their own version of the standard, etc. So now these things are hard-coded into their version of the standard, which deviates from what was eventually ratified. As a result, they have had to develop compilation tools to support their spinoff of the standard, and what we see, of course, is fragmentation.”

That, in turn, creates an opportunity for groups that provide custom software services for open-source compiler tools.

“When these hardware standards become fragmented it’s a huge support opportunity, because most people that are building open-source components want to have them submitted and accepted into the open source communities they’re building off of,” Moore said. “But GCC, for example, will not accept any submissions that deviate from the standard, so vendors that have one-off implementations of the vector extension in RISC-V need to support that outside of the community support that’s available through GCC or LLVM.”

The ramifications of fragmentation are significant. Any part of the standard that a design implements differently falls outside community support, which makes it more costly to maintain.

Even so, RISC-V does allow for a custom instruction set. “There is an extension in the RISC-V standard that allows vendors to create their own custom instructions,” Moore said. “Fragmentation is separate from that because there’s a standard for how to create the separate instructions. There’s an encoding that marks things as separate and proprietary. Those things should be considered part of this standard. Implementing an extension differently than the way the standard was ratified is where you run into trouble, at least in the software world.”
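The encoding Moore refers to can be shown concretely. In the 32-bit instruction format, the RISC-V specification sets aside four major opcodes, custom-0 through custom-3, for vendor-defined instructions. The Python sketch below classifies an instruction word by its low seven bits. The function name `custom_slot` is hypothetical, and the sketch ignores compressed and longer instruction encodings.

```python
# Major opcodes (instruction bits [6:0]) reserved for custom extensions in
# the 32-bit encoding. custom-2 and custom-3 are also reserved for RV128,
# a detail this sketch ignores.
CUSTOM_OPCODES = {
    0b0001011: "custom-0",
    0b0101011: "custom-1",
    0b1011011: "custom-2",
    0b1111011: "custom-3",
}

def custom_slot(instr: int):
    """Return the custom opcode slot for a 32-bit instruction word, or None."""
    return CUSTOM_OPCODES.get(instr & 0x7F)
```

A word whose opcode lands in one of these slots is, by construction, outside the standard extensions, which is why a toolchain can treat it as proprietary without conflicting with anything that has been ratified.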

Sounding the alarm
Making a RISC-V design compatible with another ISA requires huge amounts of verification and compatibility testing.

“RISC-V is not advocating compliance in the same way that an Arm does,” said Imperas’ Davidmann. “They don’t have the resources to provide the compatibility tests. What they have is voluntary work, creating open community tests, which are basically very simple compatibility. I don’t believe there are very good compatibility suites yet. With our first work, when our model and our reference simulator were in the compatibility group, we created some tests and we created a reference to try and help with this, but there were no resources available. In commercial companies, you’ve got to think that Intel makes a fortune from the silicon so it can invest heavily in verification compatibility stuff. Arm makes a lot of money licensing its cores, so these have huge revenue streams, and what they do is invest it in their ecosystem, and a lot goes under compatibility. The RISC-V ecosystem has no funding. Everybody’s an individual. What they’ve done in the compatibility is said, ‘Here are some simple tests, you must run them through. When you’ve run them through, send us your data, and we’ll allow you to use the badge that you are RISC-V compatible.’ But that is not a commercially sound insurance policy.”

Additionally, compliance is a necessary but not sufficient component of verification.

“A microarchitecture design may have a variety of bugs that have nothing to do with ISA or profile compliance,” Codasip’s Urquhart said. “For example, there may be race conditions or problems interfacing with interrupts or caches. Given that a processor design has a very large state space, verification of the RTL is much more complex than hard-wired accelerator blocks. Usually, a variety of methods are combined to find bugs including direct tests, constrained random methods and formal verification.”

Synopsys’ Aitken agreed that compliance testing is tricky. “This is true whether it’s an open or closed ISA. It’s really that the compliance for a customizable or extendable object becomes more challenging. What do you even mean by it? If you have an illegal opcode in Arm or x86, it just says, ‘illegal instruction,’ end of story. But in a RISC-V design, depending on what the opcode is, it might actually be an illegal opcode that is implemented. It might be something that somebody added. Do they have to tell anyone about it? There are many gray areas that can exist in this space, and because it’s potentially such a mess, people avoid it by picking well-defined subsets that just sidestep the whole question.”

The difference between compliance and verification for a fixed ISA is a little bit clearer than it is for an extensible ISA. “Does this thing implement the instructions it’s supposed to? That’s an easy question to answer,” said Aitken. “Does it do the other things it claims it does? Does it do that in a compliant way? There’s a gray area between where that leaves off and where some kind of implementation bug begins, and it’s partly because of the distinction between the ISA and architecture or microarchitecture and a specific implementation of that microarchitecture. If there’s only one, where does the problem exist? As a user of a thing, you might not have any idea. Even as a developer of the thing, you might not really be sure where in that continuum your problem is. If it’s ‘here,’ and we interpreted the spec wrong, then it’s a compliance issue. If we interpreted the spec correctly, but we implemented it wrong, that could be an architecture bug, a microarchitecture bug, or an implementation bug depending on what exactly you did because you implemented it wrong.”

How does the developer know for sure? “You don’t,” he said. “You just know that eventually you’ll have to figure out what’s wrong, and fix it. But the precise description of what was broken and what we describe it as is not necessarily clear. So it will come down to the pragmatic view of, ‘It didn’t work before, and it does now. Check the box.’ Why didn’t it work before? Well, it’s complicated.”

A cautionary note
At the end of the day, the compatibility tests in RISC-V are not verification, which means when they’re finished, they might cover 10% or 20% of the spec in terms of real capabilities, Davidmann said. “Where RISC-V really gets into trouble is there are so many choices a customer can make with the implementation that it’s actually very hard to write tests that work for all these different designs. The freedom they’ve given with RISC-V is brilliant for an industry that’s trying to build chips that are defined by software, but this freedom means it is a complete nightmare in terms of compatibility. The RISC-V industry has not really understood the complexity of this compatibility challenge, and they are absolutely not putting the resources in it.”

Arm’s Grisenthwaite added, “Compliance is really part of verification, but it is important to have precise and comprehensive specifications about what is needed to be built in order to be compliant. When it comes to individual instructions, that is straightforward; for other items, it is more complex, and subtle non-compliances can be as big an issue as blatant failures if the software relies on the behavior involved. And it’s important to remember that compliance is a subset of verification, which involves looking much more closely for issues that are specific to a particular implementation, including performance anomalies. Verification is obviously central to the design process – often it’s the most time-consuming component – and a verification methodology becomes trustworthy and resilient because it’s been deployed across millions of designs in a broad ecosystem of users.”

An extendable architecture makes it even harder.

“If you want an extendable ISA, that becomes much more challenging,” Davidmann concluded. “There is one company with the funding to invest in this. Community investment doesn’t work with hobbyists. When Arm had a problem with Linux, they put together Linaro, which was a $30 million- or $40 million-a-year investment. This can’t be hobby clubs. There must be full-time committed engineers. For RISC-V, you should probably have a team of 20 people full time building compatibility technology. It’s references, verification capability, and it’s tests. The impact of all of this is that if you’re licensing a processor, you need to know two things. Does it meet the standard so everything’s going to work? Have you got other bugs in it? Those are the questions that every IP vendor will be asked by their customers, and RISC-V International today is not concerned with the issues of compliance at the level that it should be. They say it’s on paper. Their actions do not indicate that they worry about it.”


