A Compact Model For Scalable MTJ Simulation


Published June 9, 2021. Abstract: This paper presents a physics-based modeling framework for the analysis and transient simulation of circuits containing Spin-Transfer Torque (STT) Magnetic Tunnel Junction (MTJ) devices. The framework provides the tools to analyze the stochastic behavior of MTJs and to generate Verilog-A compact models for their simulation in lar...

2D materials–based homogeneous transistor-memory architecture for neuromorphic hardware


Abstract: "In neuromorphic hardware, peripheral circuits and memories based on heterogeneous devices are generally physically separated. Thus exploring homogeneous devices for these components is an important issue for improving module integration and resistance matching. Inspired by ferroelectric proximity effect on two-dimensional materials, we present a tungsten diselenide-on-LiNbO3 cascaded...

FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator


Abstract: "Recent work demonstrated the promise of using resistive random access memory (ReRAM) as an emerging technology to perform inherently parallel analog domain in-situ matrix-vector multiplication—the intensive and key computation in deep neural networks (DNNs). One key problem is the weights that are signed values. However, in a ReRAM crossbar, weights are stored as conductance of...

Vector Runahead


Abstract: "The memory wall places a significant limit on performance for many modern workloads. These applications feature complex chains of dependent, indirect memory accesses, which cannot be picked up by even the most advanced microarchitectural prefetchers. The result is that current out-of-order superscalar processors spend the majority of their time stalled. While it is possible to bui...

Hardware Architecture and Software Stack for PIM Based on Commercial DRAM Technology


Abstract: "Emerging applications such as deep neural network demand high off-chip memory bandwidth. However, under stringent physical constraints of chip packages and system boards, it becomes very expensive to further increase the bandwidth of off-chip memory. Besides, transferring data across the memory hierarchy constitutes a large fraction of total energy consumption of systems, and the ...

Efficient Multi-GPU Shared Memory via Automatic Optimization of Fine-Grained Transfers


Authors: Harini Muthukrishnan (U of Michigan); David Nellans, Daniel Lustig (NVIDIA); Jeffrey A. Fessler, Thomas Wenisch (U of Michigan). Abstract: "Despite continuing research into inter-GPU communication mechanisms, extracting performance from multi-GPU systems remains a significant challenge. Inter-GPU communication via bulk DMA-based transfers exposes data transfer latency on the GPU’s critical...

PF-DRAM: A Precharge-Free DRAM Structure


Authors: Nezam Rohbani† (IPM); Sina Darabi§ (Sharif); Hamid Sarbazi-Azad†§ (Sharif / IPM). †School of Computer Science, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran. §Department of Computer Engineering, Sharif University of Technology, Tehran, Iran. Abstract: "Although DRAM capacity and bandwidth have increased sharply by the advances in technology ...

A Novel PUF Using Stochastic Short-Term Memory Time of Oxide-Based RRAM for Embedded Applications


Abstract: "RRAM suffers from poor retention with short-term memory time when using low compliance current for programming. However, the short-term memory time exhibits ideal randomness, which can be exploited as an entropy source for physically unclonable function (PUF). In this work, we demonstrated a novel PUF utilizing the stochastic short-term memory time of oxide-based RRAM. The proposed P...

A Novel Complementary Architecture of One-time-programmable Memory and Its Applications as Physical Unclonable Function (PUF) and One-time Password


Abstract: "For the first time, we proposed a 2T complementary architecture of one-time-programmable memory (OTP) in a foundry logic CMOS chip. It was then used to realize the PUF (Physical unclonable function), and the combination with the AI technology to provide a one-time password capability. At first, an OTP was developed based on a novel 2T CMOS unit cell. The experimental results show t...

A Machine-Learning-Resistant 3D PUF with 8-layer Stacking Vertical RRAM and 0.014% Bit Error Rate Using In-Cell Stabilization Scheme for IoT Security Applications


Abstract: "In this work, we propose and demonstrate a multi-layer 3-dimensional (3D) vertical RRAM (VRRAM) PUF with in-cell stabilization scheme to improve both cost efficiency and reliability. An 8-layer VRRAM array was manufactured with excellent uniformity and good endurance of >10^7. Apart from the variation in RRAM resistance, enhanced randomness is obtained thanks to the parasitic IR...
