New Ways To Optimize GEMM-Based Applications Targeting Two Leading AI-Optimized FPGA Architectures


A technical paper titled “Efficient Approaches for GEMM Acceleration on Leading AI-Optimized FPGAs” was published by researchers at The University of Texas at Austin and Arizona State University. Abstract: "FPGAs are a promising platform for accelerating Deep Learning (DL) applications, due to their high performance, low power consumption, and reconfigurability. Recently, the leading FPGA...

Optimizing EDA Cloud Hardware And Workloads


Optimizing EDA hardware for the cloud can shorten the time required for large and complex simulations, but not all workloads will benefit equally, and much more can be done to improve those that can. Tens of thousands of GPUs and specialized accelerators, all working in parallel, add significant and elastic compute horsepower for complex designs. That allows design teams to explore various a...

ISA and Microarchitecture Extensions Over Dense Matrix Engines to Support Flexible Structured Sparsity for CPUs (Georgia Tech, Intel Labs)


A technical paper titled "VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs" was published (preprint) by researchers at Georgia Tech and Intel Labs. Abstract: "Deep Learning (DL) acceleration support in CPUs has recently gained a lot of traction, with several companies (Arm, Intel, IBM) announcing products with specialized matrix engines accessible v...