Near-Memory Dequantization Architecture In Custom HBM For LLM Inference (SK hynix)

By Technical Paper Link - 13 Jul, 2026 - Comments: 0

Researchers from SK hynix published a technical paper titled “StreamDQ: Near-Memory Weight DeQuantization in Custom HBM for Scalable AI Inference Acceleration.” The paper proposes StreamDQ, “a lightweight architectural enhancement that enables on-the-fly dequantization in the memory subsystem for high-throughput, large-batch LLM inference,” and reports “up to 7.08× speedup and ...

Microarchitecture Tailored to 3D-Stacked Near-Memory Processing LLM Decoding (U. of Edinburgh, Peking U., Cambridge et al.)

By Technical Paper Link - 28 Apr, 2026 - Comments: 0

A new technical paper, "Rethinking Compute Substrates for 3D-Stacked Near-Memory LLM Decoding: Microarchitecture-Scheduling Co-Design," was published by researchers at the University of Edinburgh, Peking University, the University of Cambridge, the University of Chinese Academy of Sciences, and the Hong Kong University of Science and Technology. Abstract: "Large language model (LLM) decoding is a majo...

A New Breed Of Engineer

By Brian Bailey - 27 Feb, 2020 - Comments: 6

The industry loves to move in straight lines. Each generation of silicon is more-or-less a linear extrapolation of what came before. There are many reasons for this – products continue to evolve within the industry, adding new or higher performance interfaces, risk levels are lower when the minimum amount is changed for any chip spin, existing software is more likely to run with only minor mo...

tag: near memory processing
