Research Bits: April 19


Processor power prediction Researchers from Duke University, Arm Research, and Texas A&M University developed an AI method for predicting the power consumption of a processor, returning results more than a trillion times per second while consuming very little power itself. “This is an intensively studied problem that has traditionally relied on extra circuitry to address,” said Zhiy... » read more

Benchmarking Memory-Centric Computing Systems: Analysis of Real Processing-in-Memory Hardware


Abstract "Many modern workloads such as neural network inference and graph processing are fundamentally memory-bound. For such workloads, data movement between memory and CPU cores imposes a significant overhead in terms of both latency and energy. A major reason is that this communication happens through a narrow bus with high latency and limited bandwidth, and the low data reuse in memory-bo... » read more

SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems


Abstract "Several manufacturers have already started to commercialize near-bank Processing-In-Memory (PIM) architectures. Near-bank PIM architectures place simple cores close to DRAM banks and can yield significant performance and energy improvements in parallel applications by alleviating data access costs. Real PIM systems can provide high levels of parallelism, large aggregate memory bandwi... » read more

Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology


Abstract: "Emerging applications such as deep neural network demand high off-chip memory bandwidth. However, under stringent physical constraints of chip packages and system boards, it becomes very expensive to further increase the bandwidth of off-chip memory. Besides, transferring data across the memory hierarchy constitutes a large fraction of total energy consumption of systems, and the ... » read more

Power/Performance Bits: Aug. 25


AI architecture optimization Researchers at Rice University, Stanford University, University of California Santa Barbara, and Texas A&M University proposed two complementary methods for optimizing data-centric processing. The first, called TIMELY, is an architecture developed for “processing-in-memory” (PIM). A promising PIM platform is resistive random access memory, or ReRAM. Whil... » read more

Newer posts →