Hardware-Oriented Analysis of Multi-Head Latent Attention (MLA) in DeepSeek-V3 (KU Leuven)

By Technical Paper Link - 04 Jun, 2025

A new technical paper titled "Hardware-Centric Analysis of DeepSeek's Multi-Head Latent Attention" was published by researchers at KU Leuven. Abstract: "Multi-Head Latent Attention (MLA), introduced in DeepSeek-V2, improves the efficiency of large language models by projecting query, key, and value tensors into a compact latent space. This architectural change reduces the KV-cache size and s...

HW-Aligned Sparse Attention Architecture For Efficient Long-Context Modeling (DeepSeek et al.)

By Technical Paper Link - 18 Feb, 2025

A new technical paper titled "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" was published by DeepSeek, Peking University and University of Washington. Abstract: "Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant computational challenges. Sparse attention...

Locking When Emulating Xtensa LX Multi-Core On A Xilinx FPGA

By Nayan Gaywala - 31 Oct, 2024

Today's high-performance computing systems often require the designer to instantiate multiple CPU or DSP cores in a subsystem. However, the performance gained by using multiple CPUs comes with additional programming complexity, especially when accessing shared-memory data structures and hardware peripherals. CPU cores need to access shared data in an atomic fashion in a multi-core environme...

LLM Inference on GPUs (Intel)

By Technical Paper Link - 02 Feb, 2024

A technical paper titled "Efficient LLM inference solution on Intel GPU" was published by researchers at Intel Corporation. Abstract: "Transformer based Large Language Models (LLMs) have been widely used in many fields, and the efficiency of LLM inference becomes hot topic in real applications. However, LLMs are usually complicatedly designed in model structure with massive operations and...

Every Walk’s A Hit: Making Page Walks Single-Access Cache Hits

By Arm - 11 May, 2022

As memory capacity has outstripped TLB coverage, large data applications suffer from frequent page table walks. We investigate two complementary techniques for addressing this cost: reducing the number of accesses required and reducing the latency of each access. The first approach is accomplished by opportunistically "flattening" the page table: merging two levels of traditional 4 KB p...

What’s Next In AI, Chips And Masks

By Mark LaPedus - 19 Nov, 2020

Aki Fujimura, chief executive of D2S, sat down with Semiconductor Engineering to talk about AI and Moore’s Law, lithography, and photomask technologies. What follows are excerpts of that conversation.

SE: In the eBeam Initiative’s recent Luminary Survey, the participants had some interesting observations about the outlook for the photomask market. What were those observations?

Fujimur...

Machine Learning Inferencing At The Edge

By Ed Sperling - 12 Sep, 2019

Ian Bratt, fellow in Arm's machine learning group, talks about why machine learning inferencing at the edge is so difficult, what the tradeoffs are, how to optimize and accelerate data movement, and how developing these processors differs from developing other types of processors.

tag: memory access