SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Systems


Abstract "Several manufacturers have already started to commercialize near-bank Processing-In-Memory (PIM) architectures. Near-bank PIM architectures place simple cores close to DRAM banks and can yield significant performance and energy improvements in parallel applications by alleviating data access costs. Real PIM systems can provide high levels of parallelism, large aggregate memory bandwi... » read more

Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology


Abstract: "Emerging applications such as deep neural network demand high off-chip memory bandwidth. However, under stringent physical constraints of chip packages and system boards, it becomes very expensive to further increase the bandwidth of off-chip memory. Besides, transferring data across the memory hierarchy constitutes a large fraction of total energy consumption of systems, and the ... » read more

Power/Performance Bits: Aug. 25


AI architecture optimization Researchers at Rice University, Stanford University, University of California Santa Barbara, and Texas A&M University proposed two complementary methods for optimizing data-centric processing. The first, called TIMELY, is an architecture developed for “processing-in-memory” (PIM). A promising PIM platform is resistive random access memory, or ReRAM. Whil... » read more

Newer posts →