Chip Industry Technical Paper Roundup: Feb. 25


New technical papers recently added to Semiconductor Engineering’s library. Find all technical papers here.

HW-Aligned Sparse Attention Architecture For Efficient Long-Context Modeling (DeepSeek et al.)


A new technical paper titled "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" was published by DeepSeek, Peking University and the University of Washington. Abstract: "Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant computational challenges. Sparse attention...

Vision Language Models Come Rushing In


Just when you thought the pace of change of AI models couldn’t get any faster, it accelerates yet again. In the popular news media, the introduction of DeepSeek in January 2025 created a moment that captured headlines in every newspaper and website heralding comparisons to the Sputnik moment of 1957. But rapid change is also happening in many quarters that are hidden from view of the Chat-App...

Chip Industry Week In Review


The chip industry is well on its way to hitting $1 trillion in revenue by the end of the decade. Several analyst firms released 2024 annual results and 2025 predictions: Worldwide semiconductor revenue reached $626 billion in 2024, an 18% increase versus 2023, according to a preliminary Gartner report. Memory revenue grew about 70% in 2024 versus 2023. The firm forecasts that HBM will make up 19%...

Chip Industry Week In Review


Chinese startup DeepSeek rattled the tech world and U.S. stock market with claims it spent just $5.6 million on compute power for its AI model compared to its billion-dollar rivals in the U.S. The announcement raised questions about U.S. investment strategies in AI infrastructure and led to an initial $600 billion selloff of NVIDIA stock. Since its launch, DeepSeek reportedly was hit by malicio...

AI Infrastructure At A Crossroads


By Ramin Farjadrad and Syrus Ziai

There is a big push to achieve greater scale, performance and sustainability to fuel the AI revolution. More speed, more memory bandwidth, less power: these are the holy grails. Naturally, the one-two punch of StarGate and DeepSeek last week has raised many questions in our ecosystem and with our various stakeholders. Can DeepSeek be real? And if so, w...

DeepSeek: Improving Language Model Reasoning Capabilities Using Pure Reinforcement Learning


A new technical paper titled "DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning" was published by DeepSeek. Abstract: "We introduce our first-generation reasoning models, DeepSeek-R1-Zero and DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates rema...