Index-Based Multi-Core BDD Package With Dynamic Memory Management & Reduced Fragmentation


A technical paper titled "EDDY: A Multi-Core BDD Package with Dynamic Memory Management and Reduced Fragmentation" was published by researchers at University of Bremen. ABSTRACT "In recent years, hardware systems have significantly grown in complexity. Due to the increasing complexity, there is a need to continuously improve the quality of the hardware design process. This leads designers t... » read more

In-Memory Computing: Assessing Multilevel RRAM-Based VMM Operations


A new technical paper titled "Experimental Assessment of Multilevel RRAM-Based Vector-Matrix Multiplication Operations for In-Memory Computing" was published by researchers at IHP (the Leibniz Institute for High Performance Microelectronics). Abstract: "Resistive random access memory (RRAM)-based hardware accelerators are playing an important role in the implementation of in-memory computin... » read more

ISA and Microarchitecture Extensions Over Dense Matrix Engines to Support Flexible Structured Sparsity for CPUs (Georgia Tech, Intel Labs)


A technical paper titled "VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs" was published (preprint) by researchers at Georgia Tech and Intel Labs. Abstract: "Deep Learning (DL) acceleration support in CPUs has recently gained a lot of traction, with several companies (Arm, Intel, IBM) announcing products with specialized matrix engines accessible v... » read more

HBM-Enabled FPGA-Based Graph Processing Accelerator


A technical paper titled "ACTS: A Near-Memory FPGA Graph Processing Framework" was published by researchers at University of Virginia and Samsung. Abstract: "Despite the high off-chip bandwidth and on-chip parallelism offered by today's near-memory accelerators, software-based (CPU and GPU) graph processing frameworks still suffer performance degradation from under-utilization of available ... » read more

Review of Tools & Techniques for DL Edge Inference


A new technical paper titled "Efficient Acceleration of Deep Learning Inference on Resource-Constrained Edge Devices: A Review" was published in "Proceedings of the IEEE" by researchers at University of Missouri and Texas Tech University. Abstract: Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted in breakthroughs in many areas. However, deploying thes... » read more

Emulation System for Racetrack Memories Based on FPGA


A technical paper titled "ERMES: Efficient Racetrack Memory Emulation System based on FPGA" was written by researchers at University of Calabria and TU Dresden. "This paper presents a new emulation system for RTMs based on heterogeneous FPGA-CPU Systems-on-Chips (SoCs). Thanks to its high flexibility, the proposed emulator can be easily configured to evaluate different memory architectures. ... » read more

Advanced Packaging for High-Bandwidth Memory: Influences of TSV size, TSV Aspect Ratio And Annealing Temperature


A technical paper titled "Stress Issue of Vertical Connections in 3D Integration for High-Bandwidth Memory Applications" was published by researchers at National Yang Ming Chiao Tung University. Abstract: "The stress of TSV with different dimensions under annealing condition has been investigated. Since the application of TSV and bonding technology has demonstrated a promising approach for ... » read more

Side-Channel Attacks Via Cache On the RISC-V Processor Configuration


A technical paper titled "A cross-process Spectre attack via cache on RISC-V processor with trusted execution environment" was published by researchers at University of Electro-Communication, Academy of Cryptography Techniques, Technology Research Association of Secure IoT Edge Application based on RISC-V Open Architecture (TRASIO), and AIST. "This work proposed a cross-process exploitation ... » read more

Heterogeneous Multi-Core HW Architectures With Fine-Grained Scheduling of Layer-Fused DNNs


A technical paper titled "Towards Heterogeneous Multi-core Accelerators Exploiting Fine-grained Scheduling of Layer-Fused Deep Neural Networks" was published by researchers at KU Leuven and TU Munich. Abstract "To keep up with the ever-growing performance demand of neural networks, specialized hardware (HW) accelerators are shifting towards multi-core and chiplet architectures. So far, thes... » read more

Efficiently Process Large RM Datasets In Underlying Memory Pool, Disaggregated Over CXL (KAIST)


A technical paper titled "Failure Tolerant Training with Persistent Memory Disaggregation over CXL" was published (preprint) by researchers at KAIST and Panmnesia. "TRAININGCXL can efficiently process large-scale recommendation datasets in the pool of disaggregated memory while making training fault tolerant with low overhead," states the paper. Find the technical paper here. or here (IEE... » read more

← Older posts Newer posts →