RISC-V Targets Data Centers

The open-source architecture is gaining traction in more complex designs as its ecosystem matures.

RISC-V vendors are beginning to aim much higher in the compute hierarchy, targeting data centers and supercomputers rather than just simple embedded applications on the edge.

In the past, this would have been nearly impossible for a new instruction set architecture. But a growing focus on heterogeneous chip integration, combined with the reduced benefits of scaling and increasing demand for specialized accelerators, has opened the door wide to newcomers. There are an estimated 200 startups working on accelerators for AI/ML, and some of them are using RISC-V as a starting point.

What makes RISC-V particularly attractive is the ability to modify the source code. That has driven up its popularity in edge and embedded applications, but it also has sparked the interest of companies looking to use this open-source ISA for much higher-performance applications.

“In the beginning, it was very much targeted at low-end microcontrollers for cost-saving reasons,” said Simon Davidmann, CEO of Imperas Software. “Then it became people trying to do huge arrays of processors and high-end vectors. But in the last six months, there has been a lot of interest from companies pushing the envelope of high-capability cores, a mid-step between high-performance computing and embedded. These are cores that are multi-threaded, multi-core, multi-clustered, like a high-end Arm, and they’re pushing the speed envelope. They’re trying to go to 3 or 4 gigahertz, so these aren’t low-cost embedded cores, and they’re not huge arrays. They’re really high-performance application processors. What’s been seen in the RISC-V ecosystem is a progression from the initial low-end ‘free’ efforts, to the high-end ‘freedom’ work, and now to these high-performance cores, which shows that RISC-V as an architecture is applicable across all these different domains.”

To be sure, there won’t be one processor for high-performance compute. But RISC-V can be another tool in the toolbox.

“When you look at high-performance compute, it’s not one big, fast processor,” said Neil Hand, director of marketing for IC verification solutions at Siemens EDA. “That’s not what it’s about. All of the top computing systems in the EU are large-scale distributed computers. The size of an individual core in high-performance compute is not relevant, so any processor can go into high-performance compute. There are a couple of ways RISC-V can play into that. It could become a replacement for an Arm, and there are companies that are trying to do that and build more complex RISC-V processors. But it gets really interesting in that you can customize it. It doesn’t have to be a highly parallel processor, or have to handle multiple threads, or be multi-core, because when you look at high-performance compute, where are they used? A lot of them are used for specific applications. Do you want to do protein folding? Are you a military company trying to simulate the decay of your weapons over time in storage? When you start looking at those areas, you can build co-processors for that specific application. And how do you build a co-processor? In the old days, you just built a hardware co-processor. Today, you could build a programmable co-processor, which could be RISC-V-based because you can take open source software and tool chains and make it all work. That’s not to say you couldn’t do the same thing with a hardware co-processor like an Arm or an Intel, but this gives you another degree of freedom.”

Signs of maturity
Another big change in this area is the growing number of tools, which makes it much easier to develop these customized accelerators or processors.

“The most exciting thing for anyone working in high-performance compute over the last few decades has to be the number of tools they have,” said Hand. “We’ve gone from the days of, ‘Oh, I’ve got to go build a Cray in order to have any chance to do anything meaningful,’ or, ‘Let me go build a cluster of Intel PCs to go do something.’ Now you can build rich heterogeneous computing environments, and we see that with the cloud vendors — Google building ML processors, AWS adding Arm processors with accelerators, Baidu doing custom processors and using FPGAs in their systems, and so on. You can go to AWS and put a custom FPGA board into a cluster with a firewall. You can put a RISC-V CPU into an FPGA board in your [virtual] machine, and it’s all remotely controlled. The opportunities are huge, and it’s just another lever.”

None of this is being lost on the RISC-V community. “A nice feature of RISC-V is its extendability and its modular ISA, so there may be several types of RISC-V in the same HPC product,” said Zdenek Prikryl, CTO of Codasip. “Many of them support vector processing of some sort to process a lot of data in parallel. It can be implemented either by the upcoming RISC-V [V] extension or by a custom VPU (vector processing unit). This vector support is needed to satisfy AI/ML or other data-intensive computations common in the HPC domain. There are likely to be other RISC-Vs in the system that are there for control reasons or data-movement tasks. So, as you can see, you can use the same underlying ISA for different purposes, which was one of the RISC-V goals in the first place.”
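
To make the vector idea concrete, below is a minimal sketch of a strip-mined SAXPY kernel written against the RISC-V vector C intrinsics (riscv_vector.h; the names follow the intrinsics API that accompanied the extension's later ratification, so older toolchains may differ). The `saxpy` function name and the m8 register grouping are illustrative choices, not anything from a specific vendor's product.

```c
#include <riscv_vector.h>  /* RVV C intrinsics; requires an RVV-enabled toolchain */
#include <stddef.h>

/* y[i] = a * x[i] + y[i], strip-mined: each pass asks the hardware how
   many 32-bit elements it can process, instead of compiling for one
   fixed SIMD width. */
void saxpy(size_t n, float a, const float *x, float *y) {
    for (size_t vl; n > 0; n -= vl, x += vl, y += vl) {
        vl = __riscv_vsetvl_e32m8(n);                    /* elements this pass */
        vfloat32m8_t vx = __riscv_vle32_v_f32m8(x, vl);  /* load a strip of x */
        vfloat32m8_t vy = __riscv_vle32_v_f32m8(y, vl);  /* load a strip of y */
        vy = __riscv_vfmacc_vf_f32m8(vy, a, vx, vl);     /* vy += a * vx */
        __riscv_vse32_v_f32m8(y, vy, vl);                /* store back to y */
    }
}
```

Because `vsetvl` negotiates the strip length with the hardware at runtime, the same code runs on cores with narrow or very wide vector registers, which is part of what makes this approach attractive for the data-parallel HPC workloads Prikryl describes.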

In fact, the development of domain-specific architectures is currently the best solution in the race to exascale computing, according to Louie De Luna, director of marketing at Aldec. “The open-source and customizable RISC-V ISA offers a realistic and achievable path to get there, but it will take the entire industry collaborating to create a solid infrastructure, software/hardware tool chain, and ecosystem. Innovations in various scientific fields are the main drivers behind HPC, and each field has a set of unique computing requirements and workloads. Those requirements can be satisfied by RISC-V’s modular design approach and customizable instructions. Systems and SoCs based on a single ISA will offer many advantages in the future. We know that the ISA is the main interface between hardware and software components, but it also serves as a contract between hardware and software teams. In today’s SoCs there are at least five different ISAs. Having a single ISA for the whole SoC will simplify many areas of the development and business cycles.”

That could significantly speed up processes for engineering teams.

“Most HPC algorithms still exist in C-like source code form, which allows them to be used by a wide community of software developers, scientists, and researchers,” said Zibi Zalewski, general manager of the hardware division at Aldec. “Those algorithms require fast operation to solve real problems. Therefore, hardware accelerators like those based on FPGA devices became popular in HPC. To actually use the algorithms in FPGA-based hardware, they must be converted using high-level synthesis, or rewritten in an HDL to take advantage of the specifics and benefits of the FPGA architecture. But this is a time-consuming process. It requires the proper skills and experience, which means the user group is limited to FPGA engineers. However, if RISC-V is implemented in the FPGA, the process might be faster and available to more users. Implementing algorithms on platforms such as FPGA acceleration boards with RISC-V on board allows for hardware acceleration and scalability without the need to port the code to an HDL. Also, the customizable nature of RISC-V allows the design team to tweak the processor to the algorithm’s requirements to achieve the best efficiency.”
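
As a rough sketch of what tweaking the processor to the algorithm can look like from the C side, the helper below exposes a custom instruction through the GNU assembler’s .insn directive. Everything specific here is hypothetical: the custom-0 opcode slot, the zero funct3/funct7 values, and the `algo_step` name are placeholders for whatever a design team actually allocates and implements in the core’s RTL.

```c
#include <stdint.h>

/* Hypothetical custom RISC-V instruction wrapped for C callers.
   RISC-V reserves the custom-0..custom-3 opcode spaces for exactly
   this kind of extension, and .insn lets a stock GNU toolchain emit
   the encoding without any compiler modifications. */
static inline uint32_t algo_step(uint32_t a, uint32_t b) {
    uint32_t r;
    __asm__ volatile(
        ".insn r CUSTOM_0, 0, 0, %0, %1, %2"  /* R-type: rd = op(rs1, rs2) */
        : "=r"(r)
        : "r"(a), "r"(b));
    return r;
}
```

The rest of the algorithm stays in ordinary C, which is the point Zalewski makes: the wider user group keeps working in software, while the hardware team hides the acceleration behind an instruction.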

Scaling a foundation
Much of this is consistent with the original goals for RISC-V. One of the founding principles was scalability.

“They wanted an ISA that would scale from tiny embedded things to high-performance vector applications, AI applications, and highly scalable applications,” noted Megan Wachs, vice president of engineering at SiFive. “It was designed to go from 32-bit and 64-bit to 128-bit, all with the same ISA. The fact that RISC-V is being talked about at the edge is mostly timing, based on the way it’s grown up. It’s very new and growing very quickly, so it’s getting adopted first at the edge because the issues and standards have been hammered out really well for those parts. The soon-to-be-ratified vector extensions to RISC-V are solidifying, the hypervisor and other more complex extensions are moving up the stack, and the RISC-V Foundation is hammering those out and solidifying them. As a community, RISC-V is moving up.”

The ratification, and RISC-V vector development overall, have spurred interest in RISC-V for HPC, because everyone is more or less following the lead of the European Processor Initiative (EPI), noted James Prior, head of global communications for SiFive. “They’re using RISC-V vectors and RISC-V-specific IP to build the AI accelerator portion of their SoCs. We’ve seen a Cambrian explosion of AI architectures, which is wonderful. Everybody loves competition. It spurs innovation and creates all kinds of solutions, but it also has this problem that now everybody’s working on the same thing, which is the same software but for a different platform. What people are looking for is commoditization/standardization, so they can adopt without feeling like they are doing all of the work. They want to leverage more of a community effort.”

Aldec’s De Luna agreed. “The flexibility of the open-source RISC-V ISA is one of its greatest features, but it also can lead to its greatest downside, which is fragmentation. If the flexibility somehow becomes unconstrained, it can lead to incompatibilities. Software applications, operating systems, compilers, and debuggers that work on one implementation may not be compatible with another, and this needs to be prevented.”

Coupled with this is the challenge of choosing the right RISC-V core. “There are a variety of RISC-V cores already available,” Zalewski said. “It’s a challenge to know which one to choose. Which one is mature enough, fast enough and easily scalable for HPC applications? This requires knowledge and expertise.”

Next steps
Codasip’s Prikryl expects to see RISC-V cores showing up as main CPUs soon. “We just need to finish some homework, like the hypervisor extension, and implement advanced versions of RISC-V. A last important consideration is that in these use cases, the memory subsystem is critical as well. The system has to feed the vector processing unit with data properly, so all the parallelism can be properly exploited.”

That also raises the visibility for RISC-V to a new level, for better or worse. “If you’re going to use a new tool from the toolbox, you have to make sure the tool doesn’t break,” said Siemens EDA’s Hand. “The challenge with RISC-V, once you customize it, is that while you can put in custom instructions to get 10, 20, 100X improvement in performance, you’ve got to make sure it works.”

This means flexible verification solutions are needed, including machine learning applied to tools, better verification environments, and the use of Portable Stimulus. “All of these things are helping to get us to that point,” Hand said. “With RISC-V you have a framework, so 90% of your work is done. You’ve got the core instruction set, core compilers, you know how to hook into the APIs, and that’s where it gets interesting for high-performance compute. If you’re going to build one, you could go off and build a custom processor, or you could go off and utilize something that exists today. It’s not intended to compete with Intel’s PC desktop, but you can put 2,000 of these cores on a chip, put that chip in a data center, and now you’ve got a ridiculously fast system for whatever it is you want to look at. But again, we’ve got to come back to how do we verify them?”
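
One simple way to start answering that question, sketched below, is a self-checking test that compares a custom instruction against a plain-C golden model over random operands. The `algo_step` helper is the hypothetical custom instruction from the earlier sketch, and its placeholder semantics are invented for illustration. Real flows layer instruction-accurate reference models, constrained-random generation, and Portable Stimulus on top of this same compare-against-reference idea.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Custom-instruction wrapper from the earlier (hypothetical) sketch;
   on real hardware or an instruction-set simulator this executes the
   custom opcode. */
uint32_t algo_step(uint32_t a, uint32_t b);

/* Golden model: the intended semantics of the custom instruction,
   written in plain C. The operation is a placeholder. */
static uint32_t algo_step_ref(uint32_t a, uint32_t b) {
    return a ^ (b << 1);  /* placeholder semantics */
}

int main(void) {
    srand(1);  /* fixed seed so any failure reproduces */
    for (int i = 0; i < 100000; i++) {
        uint32_t a = (uint32_t)rand(), b = (uint32_t)rand();
        uint32_t hw  = algo_step(a, b);      /* device (or ISS) under test */
        uint32_t ref = algo_step_ref(a, b);  /* golden model */
        if (hw != ref) {
            fprintf(stderr, "MISMATCH a=%08" PRIx32 " b=%08" PRIx32
                    " hw=%08" PRIx32 " ref=%08" PRIx32 "\n", a, b, hw, ref);
            return 1;
        }
    }
    puts("PASS");
    return 0;
}
```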

Supercomputing with RISC-V
One example of work happening in the RISC-V HPC space is at the Barcelona Supercomputing Center.

“Europe is trying to understand how it becomes relevant in the high-performance and digital age,” said John Davis, MareNostrum Experimental Exascale Platform (MEEP) coordinator at the Barcelona Supercomputing Center. “There’s a very big push for digital autonomy or sovereignty, and that sovereignty piece is really what’s driving this view with regard to high-performance computing. From a BSC perspective, we’re proposing a completely open stack for HPC, from processors and accelerators all the way up to the software side. We’ve already done that on the software side with Linux. In 2004, we were the first supercomputing center to deploy Linux as a supercomputer operating system. We look at various pieces of the software stack that are both open and closed source, but there are many open-source variants for many of the HPC components out there. We need to create a specialized infrastructure, and without an open-source ecosystem it’s very hard to do that. With RISC-V, we have the capability to change not only the hardware, but also the software, whereas before it was only software changes. Then the question becomes a maturity question in terms of ecosystem.”

On the hardware side, that’s close, Davis said. “You can build a processor with any ISA and get things done, and because you have the capability to customize that ISA, you can add instructions, etc., and make it work. From a software ecosystem point of view, while open-source software is very useful, you still have to port that ecosystem to the platform. So software maturity is probably the biggest impediment to broad-scale adoption of RISC-V, and that’s going to take time.”

Still, there are a lot of pieces that need to be developed for supercomputing, including driver support, support for the latest version of Fortran, the OS, libraries, and application software. “We’re doing full-stack support from hardware all the way up to software,” Davis said. “We taped out a chip in May 2019. It’s a microcontroller, but it’s a first step on the path to building chips that have features you might find in the high-performance computing space. We’re also building accelerators that have a similar type of functionality.”

Conclusion
It’s still not entirely clear where RISC-V will gain the most traction, but the market for this open-source ISA definitely appears to be widening. There are still issues with tools and verification, and unanswered questions about how well it performs versus more commercially oriented rivals. But at the very least, no one is questioning anymore whether RISC-V has a future, or whether it’s a serious contender for market share and mind share.

The question now is how it performs under real-world conditions in high-performance applications. Much of the design world will be watching closely.


