SW/HW Codesign For CXL Memory Disaggregation In Billion-Scale Nearest Neighbor Search (KAIST)


A technical paper titled “Bridging Software-Hardware for CXL Memory Disaggregation in Billion-Scale Nearest Neighbor Search” was published by researchers at the Korea Advanced Institute of Science and Technology (KAIST) and Panmnesia. Abstract: "We propose CXL-ANNS, a software-hardware collaborative approach to enable scalable approximate nearest neighbor search (ANNS) services. To this e... » read more

Formally Verifying Data-Oblivious Behavior In HW Using Standard Property Checking Techniques


A technical paper titled “A Scalable Formal Verification Methodology for Data-Oblivious Hardware” was published by researchers at RPTU Kaiserslautern-Landau and Stanford University. Abstract: "The importance of preventing microarchitectural timing side channels in security-critical applications has surged in recent years. Constant-time programming has emerged as a best-practice technique... » read more

An Open-Source Hardware Design And Specification Language To Improve Productivity And Verification 


A technical paper titled “PEak: A Single Source of Truth for Hardware Design and Verification” was published by researchers at Stanford University. Abstract: "Domain-specific languages for hardware can significantly enhance designer productivity, but sometimes at the cost of ease of verification. On the other hand, ISA specification languages are too static to be used during early stage d... » read more

Programmable HW Accelerators For BGV Fully Homomorphic Encryption In The Cloud


A technical paper titled “BASALISC: Programmable Hardware Accelerator for BGV Fully Homomorphic Encryption” was published by researchers at COSIC KU Leuven, Galois Inc., and Niobium Microsystems. Abstract: "Fully Homomorphic Encryption (FHE) allows for secure computation on encrypted data. Unfortunately, huge memory size, computational cost and bandwidth requirements limit its practic... » read more

Heterogeneous Multi-Core HW Architectures With Fine-Grained Scheduling of Layer-Fused DNNs


A technical paper titled "Towards Heterogeneous Multi-core Accelerators Exploiting Fine-grained Scheduling of Layer-Fused Deep Neural Networks" was published by researchers at KU Leuven and TU Munich. Abstract: "To keep up with the ever-growing performance demand of neural networks, specialized hardware (HW) accelerators are shifting towards multi-core and chiplet architectures. So far, thes... » read more

Innovate By Customized Instructions, But Without Fragmenting The Ecosystem


This white paper reviews the design considerations for SoC designers when they deploy their hardware accelerators, and how software developers access the accelerators implemented using Arm Custom Instructions. » read more

SW/HW Framework for GASNet-enabled FPGA Hardware Acceleration Infrastructure


Researchers from KAIST and Flapmax published a new technical paper titled "FSHMEM: Supporting Partitioned Global Address Space on FPGAs for Large-Scale Hardware Acceleration Infrastructure." Abstract: "By providing highly efficient one-sided communication with globally shared memory space, Partitioned Global Address Space (PGAS) has become one of the most promising parallel computing model... » read more

MIT & UC Berkeley: “Exo” Programming Language Writes High Performance Code For HW Accelerators


New research paper titled "Exocompilation for productive programming of hardware accelerators," from researchers at MIT and UC Berkeley. From their abstract: "To better support development of high-performance libraries for specialized hardware, we propose a new programming language, Exo, based on the principle of exocompilation: externalizing target-specific code generation support and op... » read more

Designing Hardware Accelerators Using A Data-Driven Approach


A research paper titled "Data-Driven Offline Optimization For Architecting Hardware Accelerators" was published by researchers at Google Research and UC Berkeley. Abstract: "Industry has gradually moved towards application-specific hardware accelerators in order to attain higher efficiency. While such a paradigm shift is already starting to show promising results, designers need to spend considerable man... » read more

A Survey of Network-Based Hardware Accelerators


Abstract: "Many practical data-processing algorithms fail to execute efficiently on general-purpose CPUs (Central Processing Units) due to the sequential matter of their operations and memory bandwidth limitations. To achieve desired performance levels, reconfigurable (FPGA (Field-Programmable Gate Array)-based) hardware accelerators are frequently explored that permit the processing units’ a... » read more
