Algorithm HW Framework That Minimizes Accuracy Degradation, Data Movement, And Energy Consumption Of DNN Accelerators (Georgia Tech)


A new research paper titled "An Algorithm-Hardware Co-design Framework to Overcome Imperfections of Mixed-signal DNN Accelerators" was published by researchers at Georgia Tech. According to the paper's abstract, "In recent years, processing in memory (PIM) based mixed-signal designs have been proposed as energy- and area-efficient solutions with ultra high throughput to accelerate DNN com...

Polynesia, A Novel Hardware/Software Cooperative Design for In-Memory HTAP Databases


A team of researchers from ETH Zurich, Google, and the University of Illinois Urbana-Champaign recently published a technical paper titled "Polynesia: Enabling High-Performance and Energy-Efficient Hybrid Transactional/Analytical Databases with Hardware/Software Co-Design." Abstract (partial): "We propose Polynesia, a hardware–software co-designed system for in-memory HTAP [hybrid transactional/anal...

ETH Zurich: PIM (Processing In Memory) Architecture, UPMEM & PrIM Benchmarks


A new technical paper, "Benchmarking a New Paradigm: An Experimental Analysis of a Real Processing-in-Memory Architecture," comes from a team led by researchers at ETH Zurich. The researchers provide a comprehensive analysis of the first publicly available real-world PIM architecture, UPMEM, and introduce PrIM (Processing-In-Memory benchmarks), a benchmark suite of 16 workloads from different application domai...

SOT-MRAM-based CIM architecture for a CNN model


A new research paper, "In-Memory Computing Architecture for a Convolutional Neural Network Based on Spin Orbit Torque MRAM," comes from National Taiwan University, Feng Chia University, and Chung Yuan Christian University. Abstract: "Recently, numerous studies have investigated computing in-memory (CIM) architectures for neural networks to overcome memory bottlenecks. Because of its low delay, high energ...

Research Bits: April 19


Processor power prediction

Researchers from Duke University, Arm Research, and Texas A&M University developed an AI method for predicting the power consumption of a processor, returning results more than a trillion times per second while consuming very little power itself. “This is an intensively studied problem that has traditionally relied on extra circuitry to address,” said Zhiy...

Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-in-Memory Hardware


Abstract: "Many modern workloads such as neural network inference and graph processing are fundamentally memory-bound. For such workloads, data movement between memory and CPU cores imposes a significant overhead in terms of both latency and energy. A major reason is that this communication happens through a narrow bus with high latency and limited bandwidth, and the low data reuse in memory-bo...

SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems


Abstract: "Several manufacturers have already started to commercialize near-bank Processing-In-Memory (PIM) architectures. Near-bank PIM architectures place simple cores close to DRAM banks and can yield significant performance and energy improvements in parallel applications by alleviating data access costs. Real PIM systems can provide high levels of parallelism, large aggregate memory bandwi...

Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology


Abstract: "Emerging applications such as deep neural network demand high off-chip memory bandwidth. However, under stringent physical constraints of chip packages and system boards, it becomes very expensive to further increase the bandwidth of off-chip memory. Besides, transferring data across the memory hierarchy constitutes a large fraction of total energy consumption of systems, and the ...

Power/Performance Bits: Aug. 25


AI architecture optimization

Researchers at Rice University, Stanford University, University of California Santa Barbara, and Texas A&M University proposed two complementary methods for optimizing data-centric processing. The first, called TIMELY, is an architecture developed for “processing-in-memory” (PIM). A promising PIM platform is resistive random access memory, or ReRAM. Whil...