Author's Latest Posts


Improving GPU Energy Efficiency With Component-Level Power Management (AMD)


Researchers from AMD released “CompPow: A Case for Component-level GPU Power Management”. Abstract “The ever increasing demand for ML-driven intelligence in a wide spectrum of domains has led to ubiquity of GPUs. At the same time, GPUs are notorious for their power consumption needs and often dominate power allocation in a typical ML datacenter. While datacenter-level power opti... » read more

Large-scale, SRAM-based LLM Inference Deployment (Groq)


A new technical paper, "SHIP: SRAM-Based Huge Inference Pipelines for Fast LLM Serving," was published by researchers at Nvidia, with work done while at Groq. Abstract "The proliferation of large language models (LLMs) demands inference systems with both low latency and high efficiency at scale. GPU-based serving relies on HBM for model weights and KV caches, creating a memory bandwidth b... » read more

Four-Tier Memory Hierarchy for LLM Reasoning (USC, UW)


A new technical paper, "Not All Thoughts Need HBM: Semantics-Aware Memory Hierarchy for LLM Reasoning," was published by researchers at USC and University of Wisconsin-Madison. Abstract "Reasoning LLMs produce thousands of chain-of-thought tokens whose KV cache must reside in scarce GPU HBM. The dominant response -- permanently evicting low-importance tokens -- is catastrophic for reasoni... » read more

A Deionized Water-Based Large-Scale Transfer Process For 2D Materials Grown on Sapphire (AMO, RWTH, Aixtron)


A new technical paper, "Water-based, large-scale transfer of 2D materials grown on sapphire substrates," was published by researchers at AMO GmbH, RWTH Aachen University, and AIXTRON SE. Abstract "Two-dimensional materials (2DMs) hold significant potential for future electronics, as demonstrated by high-performing devices for sensing, optics, and electronics. However, scalable growth tech... » read more

Evaluating and Calibrating Performance On RISC-V Vector Processors (KTH, LLNL, BSC)


A new technical paper, "Closer in the Gap: Towards Portable Performance on RISC-V Vector Processors," was published by researchers at KTH Royal Institute of Technology, Lawrence Livermore National Laboratory, and Barcelona Supercomputing Center. Abstract "The RISC-V Vector Extension (RVV) is a cornerstone for supporting compute throughput in scientific and machine learning workloads. Yet ... » read more

Scalable Photomask Optimization With Morphological Learning (SUNY Buffalo, VU, IBM)


A new technical paper, "MorphOPC: Advancing Mask Optimization with Multi-scale Hierarchical Morphological Learning," was published by researchers at University at Buffalo, Villanova University, and IBM T. J. Watson Research Center. Abstract "As feature sizes shrink to the nanometer scale, accurately transferring circuit patterns from photomasks to silicon wafers becomes increasingly chall... » read more

Workflow-Level Design For Trustworthy GenAI Integration in Vehicles (UOL, Denso)


A new technical paper, "Workflow-Level Design Principles for Trustworthy GenAI in Automotive System Engineering," was published by researchers at University of Oldenburg and Denso Automotive. Abstract "The adoption of large language models in safety-critical system engineering is constrained by trustworthiness, traceability, and alignment with established verification practices. We propos... » read more

HW-Native GPU Compiler for Large-scale ML Production Systems (UC San Diego, Meta)


A new technical paper, "TLX: Hardware-Native, Evolvable MIMW GPU Compiler for Large-scale Production Environments," was published by researchers at UC San Diego and Meta. Abstract "Modern GPUs increasingly rely on specialized hardware units and asynchronous coordination mechanisms, so performance depends on orchestrating data movement, tensor-core computation, and synchronization rather t... » read more

Micro-Transfer Printing (MTP) As A Promising Scalable Approach to Heterogeneous Integration for Silicon Photonics (Ghent U., imec et al.)


A new technical paper, "Micro-Transfer Printing on Silicon Photonics: Tutorial, Recent Progress and Outlook," was published by researchers at Ghent University, imec et al. Abstract "This paper highlights micro-transfer printing (MTP) as a promising scalable approach to heterogeneous integration for silicon photonics. MTP uniquely achieves high integration density, high throughput, and hig... » read more

A Comprehensive Study Of Integrating 2D Materials With CFET Architecture (SKKU, et al.)


A new technical paper, "Challenges and prospects of 2D electronics for future monolithic complementary field-effect transistors," was published by researchers at Sungkyunkwan University, Hanyang University, Istituto Italiano di Tecnologia, Shanghai University, Jeonbuk National University, and Kyonggi University. Abstract "With planar complementary metal-oxide-semiconductor (CMOS) scaling ... » read more
