Processing-Using-DRAM: Attaining High-Performance Via Dynamic Precision Bit-Serial Arithmetic (ETH Zurich, et al.)


A new technical paper titled "Proteus: Achieving High-Performance Processing-Using-DRAM via Dynamic Precision Bit-Serial Arithmetic" was published by researchers at ETH Zurich, Cambridge University, Universidad de Córdoba, Univ. of Illinois Urbana-Champaign and NVIDIA Research. Abstract "Processing-using-DRAM (PUD) is a paradigm where the analog operational properties of DRAM structures ... » read more

Apple CPU Attacks: SLAP and FLOP (Georgia Tech, Ruhr University Bochum)


Two technical papers were published by researchers at Georgia Tech and Ruhr University Bochum detailing CPU side-channel attack vulnerabilities on Apple devices that could reveal confidential data. FLOP: Breaking the Apple M3 CPU via False Load Output Predictions"  Authors: Jason Kim, Jalen Chuang, Daniel Genkin and Yuval Yarom 2025. "We present FLOP, another speculative execution att... » read more

Topology And Connection Architecture Of CXL Pooling Systems (Microsoft, Columbia)


A new technical paper titled "Octopus: Scalable Low-Cost CXL Memory Pooling" was published by researchers at University of Washington, Microsoft Azure and Columbia University. Abstract "Compute Express Link (CXL) is widely-supported interconnect standard that promises to enable memory disaggregation in data centers. CXL allows for memory pooling, which can be used to create a shared memory ... » read more

DeepSeek: Improving Language Model Reasoning Capabilities Using Pure Reinforcement Learning


A new technical paper titled "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" was published by DeepSeek. Abstract: "We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates rema... » read more

Analog Accelerator For AI/ML Training Workloads Using Stochastic Gradient Descent (Imperial College London)


A new technical paper titled "Learning in Log-Domain: Subthreshold Analog AI Accelerator Based on Stochastic Gradient Descent" was published by researchers at Imperial College London. Abstract "The rapid proliferation of AI models, coupled with growing demand for edge deployment, necessitates the development of AI hardware that is both high-performance and energy-efficient. In this paper, w... » read more

Parallelized Compilation Pipeline Optimized for Chiplet-Based Quantum Computers


A new technical paper titled "Modular Compilation for Quantum Chiplet Architectures" was published by researchers at Northwestern University. Abstract "As quantum computing technology continues to mature, industry is adopting modular quantum architectures to keep quantum scaling on the projected path and meet performance targets. However, the complexity of chiplet-based quantum devices, cou... » read more

Design-Space Analysis of M3D FPGA With BEOL Configuration Memories (Georgia Tech, UCLA)


A new technical paper titled "Monolithic 3D FPGAs Utilizing Back-End-of-Line Configuration Memories" was published by researchers at Georgia Tech and UCLA. Abstract "This work presents a novel monolithic 3D (M3D) FPGA architecture that leverages stackable back-end-of-line (BEOL) transistors to implement configuration memory and pass gates, significantly improving area, latency, and power ef... » read more

Design Space for the Device-Circuit Codesign of NVM-Based CIM Accelerators (TSMC)


A new technical paper/mini-review titled "Assessing Design Space for the Device-Circuit Codesign of Nonvolatile Memory-Based Compute-in-Memory Accelerators" was published by researchers at TSMC and National Tsing Hua University. Abstract "Unprecedented penetration of artificial intelligence (AI) algorithms has brought about rapid innovations in electronic hardware, including new memory devi... » read more

Geometric-Aware Model Merging Approach To Enhance Instruction Alignment in Chip LLMs (Nvidia)


A new technical paper titled "ChipAlign: Instruction Alignment in Large Language Models for Chip Design via Geodesic Interpolation" was published by researchers at NVIDIA Research. Abstract: "Recent advancements in large language models (LLMs) have expanded their application across various domains, including chip design, where domain-adapted chip models like ChipNeMo have emerged. However, ... » read more

AI Accelerators for Homomorphic Encryption Workloads


A new technical paper titled "Leveraging ASIC AI Chips for Homomorphic Encryption" was published by researchers at Georgia Tech, MIT, Google and Cornell University. Abstract: "Cloud-based services are making the outsourcing of sensitive client data increasingly common. Although homomorphic encryption (HE) offers strong privacy guarantee, it requires substantially more resources than compu... » read more

← Older posts Newer posts →