Re-Targetable LLVM C/C++ Compiler For RISC-V

Enabling a compiler to use both standard and custom instructions automatically and wisely.

RISC-V is a modular instruction set architecture (ISA) with great customization capabilities that enable innovation and differentiation without fragmentation. On top of the baseline modules from ratified/standard ISA extensions, such as integer instructions or floating-point instructions, designers can add custom instructions: pure design freedom! And the reasons for adding instructions are many: better performance, smaller memory footprint, lower power consumption, or anything in between.

This means one important thing: the software (the final application or applications) must be compiled for the particular RISC-V ISA. The software development kit (SDK) must know which ISA modules the RISC-V processor implements so that it can use them automatically. This includes both standard instructions and custom instructions.

But how do you get the best SDK for a given RISC-V ISA? Let’s have a look in this blog as we focus on the C/C++ compiler, an essential part of the SDK. The compiler must, as much as possible, be able to automatically and wisely use instructions.

What is an LLVM C/C++ compiler?

LLVM (originally short for Low Level Virtual Machine) is a collection of compilers and related tools, such as an assembler, a linker, and a debugger. Let’s focus on the compiler.

Like most modern compilers, the LLVM compiler can be split into three parts: the front-end, the optimizer (also known as the mid-end), and the back-end, as shown in the following picture. Each of these layers has a different purpose.

LLVM compiler front-end

The front-end takes a text file as input, namely source code written in C or C++. It parses the code and creates an intermediate representation (IR). The IR represents the input program in a form that the rest of the compiler can analyze and transform.

LLVM compiler optimizer

The optimizer takes the IR, applies mostly target-independent optimizations such as loop unrolling or constant propagation, and produces an optimized IR.
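To make these two optimizations concrete, here is a minimal C sketch of their effect. It is written at the source level purely for illustration; the optimizer works on the IR, and the function names are made up for this example.

```c
#include <stdint.h>

/* What the programmer writes: a loop with a fixed trip count
   and a scale factor that is known at compile time. */
int32_t scale_sum_source(const int32_t v[4])
{
    const int32_t scale = 2;
    int32_t sum = 0;
    for (int i = 0; i < 4; ++i) {
        sum += v[i] * scale;
    }
    return sum;
}

/* Conceptually what remains after constant propagation (scale becomes 2)
   and full loop unrolling (no counter, no branch). Illustrative only. */
int32_t scale_sum_optimized(const int32_t v[4])
{
    return v[0] * 2 + v[1] * 2 + v[2] * 2 + v[3] * 2;
}
```

Both functions compute the same value. The second form simply shows the kind of result these transformations aim for.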

LLVM compiler back-end

The back-end then takes the optimized IR and performs target-dependent optimizations, register allocation, stack manipulation, and so on. At the end, it produces the assembly code for the processor.

For re-targetability, the back-end is the most important part because it must know about the target architecture, its instructions, etc. And it is the back-end that Codasip’s solutions generate.

Re-targeting the LLVM C/C++ compiler

This is how we do it at Codasip.

CodAL: a single source of the processor description

The RISC-V ISA, including the custom instructions, is captured in CodAL, a C-based processor description language. The language captures all the important information, including each instruction’s text form, its binary encoding, and, more importantly, its behavior. The CodAL description also contains information about how different types of hazards are handled (which influences instruction scheduling), how the instructions are implemented (single-cycle or multi-cycle), the processor’s application binary interface (for example, which registers are used for the stack), and other microarchitectural details. It also describes other C/C++ compiler features (for example, peephole optimizations).

The following example shows a simple instruction that represents an average of two numbers.
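As a rough illustration of the behavior such an instruction could implement, here is a plain C reference model. This is not CodAL and not Codasip’s definition: the function name, the unsigned 32-bit operands, and the round-down behavior are all assumptions made for this sketch.

```c
#include <stdint.h>

/* Hypothetical reference model of an "average" instruction:
   rd = (rs1 + rs2) / 2, rounding down.
   The sum is computed in 64 bits so that it cannot overflow. */
uint32_t avg_u32(uint32_t rs1, uint32_t rs2)
{
    uint64_t sum = (uint64_t)rs1 + (uint64_t)rs2;
    return (uint32_t)(sum >> 1);
}
```

A single instruction with this behavior would replace the widening add and the shift that a baseline ISA would otherwise need.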

How it all works

From this description, a wide range of tools are generated. The following figure shows the overall methodology. As you can see, one of the generated outputs is the LLVM C/C++ compiler.

The C/C++ compiler generator parses all the described instructions and the microarchitectural description. It then extracts the instruction semantics, the ABI, and the timing, and generates a new back-end along with configuration files for the front-end and optimizer. In other words, the front-end and optimizer are precompiled and only configured, which enables fast design space exploration. The back-end, on the other hand, needs to be compiled. The generated back-end is aware of every single instruction that the RISC-V processor has. Note that the averaging instruction mentioned above can be used automatically by the generated C/C++ compiler, or it can be used explicitly through automatically generated intrinsics or inline assembly.
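The two ways of reaching a custom instruction can be sketched in application code as follows. This is a sketch under stated assumptions: the macro HAVE_CUSTOM_AVG and the intrinsic name __builtin_custom_avg are hypothetical placeholders, not names taken from any Codasip tool. Without the macro, the code is ordinary portable C.

```c
#include <stddef.h>
#include <stdint.h>

/* HAVE_CUSTOM_AVG and __builtin_custom_avg are hypothetical names
   used only to illustrate the idea of a generated intrinsic. */
static inline uint32_t average(uint32_t a, uint32_t b)
{
#if defined(HAVE_CUSTOM_AVG)
    /* Explicit use: call the (hypothetical) generated intrinsic. */
    return __builtin_custom_avg(a, b);
#else
    /* Automatic use: plain C. A back-end that knows the semantics of the
       averaging instruction may select it for this expression. */
    return (uint32_t)(((uint64_t)a + (uint64_t)b) >> 1);
#endif
}

/* Writes n - 1 outputs: the average of each pair of neighboring inputs. */
void smooth(const uint32_t *in, uint32_t *out, size_t n)
{
    for (size_t i = 0; i + 1 < n; ++i) {
        out[i] = average(in[i], in[i + 1]);
    }
}
```

The point of the plain C path is that existing application code does not have to change to benefit from a new instruction.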

What is also unique is that the generated back-end is open to the designers. If designers want to add a new LLVM optimization pass, for instance one they have already written in C++, they can.

We also improved the vanilla LLVM. We added advanced optimization passes that target performance gains (for example improved jump threading, superblock scheduling, or loop collapsing/flattening), code size lowering (for example improved -msave-restore, improved support for instructions with multiple outputs, or machine outliner), or DSP features (for example zero overhead loops, dual-stack architecture support, or load/store with post/pre increment).
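For the DSP features, a typical candidate is a small counted loop that walks through arrays. The following C sketch shows such a loop; the function name is made up, and whether a given compiler actually maps it to a zero-overhead loop or to post-increment loads depends on the target and on the compiler.

```c
#include <stddef.h>
#include <stdint.h>

/* Dot product of two 16-bit vectors, accumulated in 32 bits. */
int32_t dot_product(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc = 0;
    while (n--) {
        /* Each *p++ is a load followed by a pointer increment, which is
           the pattern a load-with-post-increment instruction covers. */
        acc += (int32_t)(*a++) * (int32_t)(*b++);
    }
    return acc;
}
```

The loop counter and the backward branch are pure overhead here, which is what a hardware loop is meant to remove.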

Let’s see how the generated C/C++ compiler stands in the benchmarks.

Results

Let’s focus on two aspects of the compiler: performance and code size.

CoreMark and Dhrystone are used to measure performance, and Embench-IoT is used to measure code size. Three compilers are compared: GCC, vanilla LLVM, and Codasip LLVM. The results are shown relative to vanilla LLVM, which is the reference (that is, it scores 1 in the graphs). The RISC-V ISA was configured as RV32IMCB.
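The normalization can be sketched as a simple ratio. The function name is an assumption for illustration, and the sketch does not say whether a higher or a lower ratio is better, since that depends on the metric being plotted.

```c
/* Normalizes one compiler's measurement to the reference compiler's
   measurement for the same benchmark. The reference itself scores 1. */
double relative_to_reference(double measured, double reference)
{
    return measured / reference;
}
```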

Performance results

The performance comparison shows the improvements of the Codasip LLVM and how it outperforms other compilers. Note that similar optimization flags were used across the compilers. Custom instructions can significantly improve the results.

Code size results

The code size comparison also shows the benefits of the Codasip LLVM, even though the difference is not huge. Here again custom instructions can significantly improve the results. By the way, this is what my colleague Tariq Kurd explained in his blog post about RISC-V code size and adding a new Zc instruction.

It’s time to automate your innovations

The RISC-V ISA is still evolving, and we need ways to easily explore different instructions and their impact on software (in terms of performance, code size, or power). Automation is highly desirable so that we can explore the design space quickly and efficiently. And if designers want to innovate or differentiate, they need tools and languages that enable them to do so.

The elegant way to do this is with Codasip Studio. Our processor design toolset automatically generates all the needed parts from a single source of truth. The generated LLVM C/C++ compiler can use the new instructions automatically (that is, there is no need to change your C/C++ code unless you want to). It also delivers strong results in both performance and code size.

As icing on the cake, Codasip Studio generates other outputs, such as executable models, RTL, and verification tools, that complete the IP package so that the innovative RISC-V architecture can be deployed in the final product.


