Improving DRAM Performance, Security, and Reliability by Understanding and Exploiting DRAM Timing Parameter Margins


Abstract: "Characterization of real DRAM devices has enabled findings in DRAM device properties, which has led to proposals that significantly improve overall system performance by reducing DRAM access latency and power consumption. In addition to improving system performance, a deeper understanding of DRAM technology via characterization can also improve device reliability and security. The...

A Deeper Look into RowHammer’s Sensitivities: Experimental Analysis of Real DRAM Chips and Implications on Future Attacks and Defenses


Abstract: "RowHammer is a circuit-level DRAM vulnerability where repeatedly accessing (i.e., hammering) a DRAM row can cause bit flips in physically nearby rows. The RowHammer vulnerability worsens as DRAM cell size and cell-to-cell spacing shrink. Recent studies demonstrate that modern DRAM chips, including chips previously marketed as RowHammer-safe, are even more vulnerable to RowHammer than...

HECTOR-V: A Heterogeneous CPU Architecture for a Secure RISC-V Execution Environment


Summary: "To ensure secure and trustworthy execution of applications, vendors frequently embed trusted execution environments into their systems. Here, applications are protected from adversaries, including a malicious operating system. TEEs are usually built by integrating protection mechanisms directly into the processor or by using dedicated external secure elements. However, both of these...

A Compact Model For Scalable MTJ Simulation


Published June 9, 2021. Abstract: This paper presents a physics-based modeling framework for the analysis and transient simulation of circuits containing Spin-Transfer Torque (STT) Magnetic Tunnel Junction (MTJ) devices. The framework provides the tools to analyze the stochastic behavior of MTJs and to generate Verilog-A compact models for their simulation in lar...

2D materials–based homogeneous transistor-memory architecture for neuromorphic hardware


Abstract: "In neuromorphic hardware, peripheral circuits and memories based on heterogeneous devices are generally physically separated. Thus exploring homogeneous devices for these components is an important issue for improving module integration and resistance matching. Inspired by ferroelectric proximity effect on two-dimensional materials, we present a tungsten diselenide-on-LiNbO3 cascaded...

Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology


Abstract: "Emerging applications such as deep neural network demand high off-chip memory bandwidth. However, under stringent physical constraints of chip packages and system boards, it becomes very expensive to further increase the bandwidth of off-chip memory. Besides, transferring data across the memory hierarchy constitutes a large fraction of total energy consumption of systems, and the ...

IChannels: Exploiting Current Management Mechanisms to Create Covert Channels in Modern Processors


Abstract: "To operate efficiently across a wide range of workloads with varying power requirements, a modern processor applies different current management mechanisms, which briefly throttle instruction execution while they adjust voltage and frequency to accommodate for power-hungry instructions (PHIs) in the instruction stream. Doing so 1) reduces the pow...

SpZip: Architectural Support for Effective Data Compression In Irregular Applications


Published in: 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA). Authors: Yifan Yang (MIT); Joel Emer (MIT / NVIDIA); Daniel Sanchez (MIT). Abstract: "Irregular applications, such as graph analytics and sparse linear algebra, exhibit frequent indirect, data-dependent accesses to single or short sequences of elements that cause high ma...

Communication Algorithm-Architecture Co-Design for Distributed Deep Learning


Abstract: "Large-scale distributed deep learning training has enabled developments of more complex deep neural network models to learn from larger datasets for sophisticated tasks. In particular, distributed stochastic gradient descent intensively invokes all-reduce operations for gradient update, which dominates communication time during iterative training epochs. In this work, we identify th...

Ten Lessons From Three Generations Shaped Google’s TPUv4i


Source: Norman P. Jouppi, Doe Hyun Yoon, Matthew Ashcraft, Mark Gottscho, Thomas B. Jablin, George Kurian, James Laudon, Sheng Li, Peter Ma, Xiaoyu Ma, Nishant Patil, Sushma Prasad, Clifford Young, Zongwei Zhou (Google); David Patterson (Google / Berkeley). Published in: 2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA). Abstract: "Google de...
