AI, GPU, And HPC Data Centers: The Infrastructure Behind Modern AI


Artificial intelligence (AI) is stretching compute infrastructure well beyond what traditional enterprise data centers were designed to handle. Modern AI training requires massively parallel compute, low-latency networking, high-throughput storage pipelines, and facility engineering that can safely support higher rack power densities than legacy environments. These demands are fueling the eme...

Scaling llama.cpp On Neoverse N2: Solving Cross-NUMA Performance Issues


This blog post explains the cross-NUMA memory access issue that occurs when you run llama.cpp on Neoverse. It also introduces a proof-of-concept patch that addresses this issue and can provide up to a 55% performance increase for text generation when you run the llama3_Q4_0 model on the ZhuFeng Neoverse system. Cross-NUMA memory access problem: In llama.cpp, performance drops when the number o...

Minimum Energy Per Query


Key Takeaways: Extracting heat from a chip faster is a short-term fix to a bigger problem. The longer-term challenge is how to reduce the amount of energy used per query. Data movement, guardbanding, and software inefficiency are key targets for the future. Heat is a serious problem within AI chips, and it is limiting how much processing can be done. The solution is either to...

Modern Trends In Floating-Point


The requirement to support real numbers in computers has existed for as long as computers themselves, yet has always been a more complicated challenge than it at first appears. Why? Because computer-based representations can only represent a finite subset of the continuum of real numbers. Consequently, they can only ever be considered an approximation – thereby demanding a diligent understand...

Six 3D-IC Design Trends That Secure The AI Era


By Pratyush Kamal and Todd Burkholder
Greater functionality, performance, and speed are in great demand in pervasive computing, RF, and automotive electronic systems, as well as most everything else. Complexity continues to skyrocket, leading many to say we are officially in the post-Moore’s Law world. In his seminal 1965 paper, “Cramming more components onto integrated circuits,”...

Solving Real-World AI Bottlenecks


The race to build smarter and faster AI chips continues to surge. This is especially true in autonomous vehicles that interpret the world in milliseconds, edge accelerators that push trillions of operations per second, hyperscale data-center processors that drive massive workloads, and next-generation consumer devices that demand ever-higher intelligence. As modern system-on-chip (SoC) architec...

How The EDA Industry Will Evolve In 2026


AI will continue to impact every facet of the EDA industry. Pressure will mount in 2026 on design teams to drive productivity gains while technical complexity continues to escalate. This will reshape how teams work and the tools they use. Success will be determined by balancing the trade-offs between integrated platforms and best-of-breed toolchains and developing talent internally rather than ...

AI’s Impact On Engineering Jobs May Be Different Than Expected


Key Takeaways: AI is expected to eliminate many repetitive, entry-level tasks, but that may allow engineering students trained on the latest tools to start in more senior positions. AI is a force multiplier. It can accelerate the learning curve for junior engineers. While AI is very good at solving multi-dimensional problems, domain expertise, critical thinking, and sanity checks wil...

Balancing Training, Quantization, And Hardware Integration In NPUs


Experts At The Table: AI/ML is driving a steep ramp in neural processing unit (NPU) design activity for everything from data centers to edge devices such as PCs and smartphones. Semiconductor Engineering sat down to discuss this with Jason Lawley, director of product marketing, AI IP at Cadence; Sharad Chole, chief scientist and co-founder at Expedera; Steve Roddy, chief marketing officer at Qu...

GDDR7 Momentum Accelerates As A Key Solution For AI Inference


The AI hardware landscape continues to evolve at breakneck speed, and memory technology is rapidly becoming a defining differentiator for the next generation of GPUs and AI inference accelerators. When NVIDIA introduced Rubin CPX, its new class of GPU tailored for massive context inference, it underscored a new industry reality: memory throughput and efficiency are now just as critical as ra...
