Evaluation of LLMs on HDL-Based Communication Protocol Generation (UIUC, CISPA)


A new technical paper titled "ProtocolLLM: RTL Benchmark for SystemVerilog Generation of Communication Protocols" was published by researchers at University of Illinois Urbana-Champaign and CISPA Helmholtz Center for Information Security. Abstract "Recent advances in Large Language Models (LLMs) have shown promising capabilities in generating code for general-purpose programming languages. ... » read more

Open-Source RISC-V Cores: Analysis Of Scalar and Superscalar Architectures And Out-Of-Order Machines


A new technical paper titled "Ramping Up Open-Source RISC-V Cores: Assessing the Energy Efficiency of Superscalar, Out-of-Order Execution" was published by researchers at ETH Zurich, Università di Bologna and Univ. Grenoble Alpes, Inria. Abstract "Open-source RISC-V cores are increasingly demanded in domains like automotive and space, where achieving high instructions per cycle (IPC) throu... » read more

Hardware-Oriented Analysis of Multi-Head Latent Attention (MLA) in DeepSeek-V3 (KU Leuven)


A new technical paper titled "Hardware-Centric Analysis of DeepSeek's Multi-Head Latent Attention" was published by researchers at KU Leuven. Abstract "Multi-Head Latent Attention (MLA), introduced in DeepSeek-V2, improves the efficiency of large language models by projecting query, key, and value tensors into a compact latent space. This architectural change reduces the KV-cache size and s... » read more
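The latent-space projection the abstract describes can be sketched roughly as follows. This is a minimal illustration of the idea, not DeepSeek's actual architecture; all dimensions, weight names, and the einsum layout are assumptions chosen for clarity.

```python
import numpy as np

# Illustrative dimensions (assumed, not DeepSeek-V2/V3 values).
d_model = 1024      # hidden size
n_heads = 8
d_head = 128        # per-head key/value dimension
d_latent = 64       # compact latent dimension, d_latent << n_heads * d_head

rng = np.random.default_rng(0)

# Shared down-projection: hidden state -> compact latent vector.
W_down = rng.standard_normal((d_model, d_latent)) * 0.02
# Per-head up-projections re-expand keys/values from the latent.
W_up_k = rng.standard_normal((n_heads, d_latent, d_head)) * 0.02
W_up_v = rng.standard_normal((n_heads, d_latent, d_head)) * 0.02

seq_len = 16
h = rng.standard_normal((seq_len, d_model))   # token hidden states

# Only the latent vectors are cached: one d_latent vector per token.
latent_cache = h @ W_down                     # (seq_len, d_latent)

# At attention time, keys/values are reconstructed from the latent.
k = np.einsum('sl,hld->hsd', latent_cache, W_up_k)  # (n_heads, seq_len, d_head)
v = np.einsum('sl,hld->hsd', latent_cache, W_up_v)

# Cache-size comparison vs. a standard per-head KV cache (element counts).
standard_kv_elems = 2 * seq_len * n_heads * d_head  # K and V for every head
mla_cache_elems = seq_len * d_latent
print(standard_kv_elems, mla_cache_elems)           # 32768 vs 1024
```

With these toy dimensions the latent cache is 32x smaller than a full KV cache, which is the source of the bandwidth savings the paper analyzes from a hardware perspective.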

Customizing An LLM For VHDL Code And Design Of High-Performance Processors (IBM)


A new technical paper titled "Customizing a Large Language Model for VHDL Design of High-Performance Microprocessors" was published by researchers at IBM. Abstract "The use of Large Language Models (LLMs) in hardware design has taken off in recent years, principally through its incorporation in tools that increase chip designer productivity. There has been considerable discussion about the ... » read more

Arithmetic Intensity In Decoding: A Hardware-Efficient Perspective (Princeton University)


A new technical paper titled "Hardware-Efficient Attention for Fast Decoding" was published by researchers at Princeton University. Abstract "LLM decoding is bottlenecked for large batches and long contexts by loading the key-value (KV) cache from high-bandwidth memory, which inflates per-token latency, while the sequential nature of decoding limits parallelism. We analyze the interplay amo... » read more
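The KV-cache bottleneck the abstract points to can be made concrete with a back-of-the-envelope arithmetic-intensity estimate for attention during decode. The numbers below are illustrative assumptions, not figures from the paper.

```python
# Back-of-envelope arithmetic intensity of attention at decode time:
# per generated token, the entire KV cache is read from HBM, but each
# cached element participates in only a couple of multiply-adds.

n_heads = 32
d_head = 128
seq_len = 8192          # context length (assumed)
bytes_per_elem = 2      # fp16

# Bytes loaded from HBM: K and V for every cached position.
kv_bytes = 2 * seq_len * n_heads * d_head * bytes_per_elem

# FLOPs: q.k dot products plus the weighted sum over v,
# roughly 2 multiply-adds (4 FLOPs) per cached K/V element pair.
flops = 4 * seq_len * n_heads * d_head

intensity = flops / kv_bytes    # FLOPs per byte loaded
print(intensity)                # 1.0 FLOP/byte
```

At roughly 1 FLOP per byte, decode attention sits far below the compute/memory ridge point of modern GPUs (hundreds of FLOPs per byte), which is why KV-cache loading, not arithmetic, dominates per-token latency in this regime.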

Roadmap For AI Hardware Development And The Role Of Photonic Chips In Supporting Future LLMs (CUHK, NUS, UIUC, Berkeley)


A new technical paper titled "What Is Next for LLMs? Next-Generation AI Computing Hardware Using Photonic Chips" was published by researchers at The Chinese University of Hong Kong, National University of Singapore, University of Illinois Urbana-Champaign and UC Berkeley. Abstract "Large language models (LLMs) are rapidly pushing the limits of contemporary computing hardware. For example, t... » read more

SRAM Cell Scaling With Monolithic 3D Integration Of 2D FETs (Penn State)


A new technical paper titled "Enabling static random-access memory cell scaling with monolithic 3D integration of 2D field-effect transistors" was published by researchers at The Pennsylvania State University. Abstract "Static Random-Access Memory (SRAM) cells are fundamental in computer architecture, serving crucial roles in cache memory, buffers, and registers due to their high-speed perf... » read more

Chiplet-to-Chiplet Gateway Architecture, A C2C Interface Bridging Two Chiplet Protocols (Peter Grünberg Institute, Jülich Supercomputing Centre)


A new technical paper titled "Modeling Chiplet-to-Chiplet (C2C) Communication for Chiplet-based Co-Design" was published by researchers at Peter Grünberg Institute and Jülich Supercomputing Centre. Abstract "Chiplet-based processor design, which combines small dies called chiplets to form a larger chip, enables scalable designs at economical costs. This trend has received high attention s... » read more

Offline RL Framework That Dynamically Controls The GPU Clock And Server Fan Speed To Optimize Power Consumption And Computation Time (KAIST)


A new technical paper titled "Power Consumption Optimization of GPU Server With Offline Reinforcement Learning" was published by researchers at Korea Advanced Institute of Science and Technology (KAIST) and KT Research and Development Center. Abstract "Optimizing GPU server power consumption is complex due to the interdependence of various components. Conventional methods often involve trade-offs: in... » read more

Electrical Properties of ML and BL MoS2 GAA NS FETs With Source/Drain Metal Contacts (NYCU)


A new technical paper titled "Electrical Characteristics of ML and BL MoS2 GAA NS FETs With Source/Drain Metal Contacts" was published by researchers at National Yang Ming Chiao Tung University. Abstract "This paper reports source/drain (S/D) contact issues in monolayer (ML) and bilayer (BL) MoS2 devices through density-functional-theory (DFT) calculation and device simulation. We begin by ana... » read more
