Re-Targetable LLVM C/C++ Compiler For RISC-V


RISC-V is a modular instruction set architecture (ISA) with great customization capabilities that enable innovation and differentiation without fragmentation. On top of the baseline modules from ratified/standard ISA extensions, such as integer instructions or floating-point instructions, designers can add custom instructions: pure design freedom! And the reasons for adding instructions are man...

Compiler Optimization Made Easy


In a previous blog post, we discussed the benefits of using automation to maximize the performance of a system. One use case I mentioned was compiler flag mining, and the fact that performance is available beyond the standard optimization flags provided by most compilers. Getting to this untapped performance is a difficult problem to solve, but fortunately there is an easy way. A universe of o...

Optimizing Hardware Capacity, Utilizing Automatic Differentiation to Efficiently Compute Derivatives in Parallel Programming Models


A technical paper titled "Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation" was published by researchers at MIT (CSAIL), Argonne National Lab, and TU Munich. The paper was a Best Paper Finalist and a Best Student Paper winner at Supercomputing 2022 (SC22). Find the technical paper here. Published November 2022. The work "demonstrates how Enzyme opti...

Customizing Processors


The design, verification, and implementation of a processor is the core competence of some companies, but others just want to whip up a small processor as quickly and cheaply as possible. What tools and options exist? Processors range from very small, simple cores that are deeply embedded into products to those operating at the highest possible clock speeds and throughputs in data centers. I...

Research Platform for Heterogeneous Computing (ETH Zurich)


New academic paper from ETH Zurich, "HEROv2: Full-Stack Open-Source Research Platform for Heterogeneous Computing." Abstract: "Heterogeneous computers integrate general-purpose host processors with domain-specific accelerators to combine versatility with efficiency and high performance. To realize the full potential of heterogeneous computers, however, many hardware and software design ...

Big Changes In Embedded Software


Every good hardware or software design starts with a structured approach throughout the design cycle, but as chip architectures and applications begin focusing on specific domains and include some version of AI, that structure is becoming more difficult to define. Embedded software, which in the past was written for very narrow functions with a minimal footprint, is increasingly getting blended...

Better Benchmarks Through Compiler Optimizations: Codasip Jump Threading


The architectural efficiency of embedded processor IP is measured by a small set of industry-standard benchmarks that, even though they often bear little correlation to real workloads, continue to persist. The most popular benchmarks are Dhrystone and CoreMark. An interesting observation regarding these test suites is that the performance numbers continue to improve for a given architecture, even...