Formal Verification Fundamentals Remain Non-Negotiable In The New Verification Revolution


The semiconductor industry stands at a critical juncture. First-time silicon success rates have reached all-time lows, while design complexity continues to grow exponentially. System-on-chip designs now integrate billions of transistors, multiple processor cores, complex memory hierarchies, and sophisticated interconnect fabrics. In this environment, the stakes for verification accuracy have ne... » read more

AI, GPU, And HPC Data Centers: The Infrastructure Behind Modern AI


Artificial intelligence (AI) is stretching compute infrastructure well beyond what traditional enterprise data centers were designed to handle. Modern AI training requires massively parallel compute, low-latency networking, high-throughput storage pipelines, and facility engineering that can safely support higher rack power densities than legacy environments. These demands are fueling the eme... » read more

Power Leadership At 2nm: Foundation IP Optimized For Next-Gen Hyperscale SoCs


By Andrew Appleby and Daryl Seitzer

As demand for data center compute accelerates, power efficiency has become the defining metric for modern CPUs, GPUs, and AI accelerators. Every watt saved directly impacts the massive operating costs of gigawatt-scale AI data centers, where power and cooling account for 40–60% of operational expenditures. To reduce energy consumption and strengthen t... » read more

Scaling llama.cpp On Neoverse N2: Solving Cross-NUMA Performance Issues


This blog post explains the cross-NUMA memory access issue that occurs when you run llama.cpp on Neoverse. It also introduces a proof-of-concept patch that addresses this issue and can provide up to a 55% performance increase for text generation when you run the llama3_Q4_0 model on the ZhuFeng Neoverse system.

Cross-NUMA memory access problem

In llama.cpp, performance drops when the number o... » read more

Engineering After Orthogonalization: Why Verification Has Become A Lifecycle Discipline


Over the past several decades studying verification practices across the semiconductor industry, I’ve watched assumptions that once held up remarkably well begin to strain under the weight of modern system complexity. This is not a loss of engineering rigor. It is the result of systems that no longer conform to the boundaries earlier design models depended on. For much of the industry’s ... » read more

The Design Challenges Of Clock Integrity And Clock Jitter


Signal integrity is one of the many challenges faced by chip designers. Deep submicron technologies are unfriendly hosts for the nice, clean signals desired. The culprits that compromise signal integrity and introduce jitter include thermal effects, manufacturing flaws, signal crosstalk, IR (voltage) drop, signal loss over long runs, reflections, electromagnetic interference (EMI), ground bounc... » read more

GDDR7 Momentum Accelerates As A Key Solution For AI Inference


The AI hardware landscape continues to evolve at breakneck speed, and memory technology is rapidly becoming a defining differentiator for the next generation of GPUs and AI inference accelerators. When NVIDIA introduced Rubin CPX, its new class of GPU tailored for massive context inference, it underscored a new industry reality: memory throughput and efficiency are now just as critical as ra... » read more

Exploring The Latest Innovations In MIPI D-PHY And MIPI C-PHY


By Michael Nagib and Nuno Martins

In the ever-evolving landscape of high-performance camera and display technologies, MIPI D-PHY and MIPI C-PHY specifications continue to lead the charge, setting benchmarks for low power, low latency, and high-bandwidth data transmission. Building on the insights from our previous article, “Demystifying MIPI C-PHY/D-PHY Subsystem,” we now delve into... » read more

Smarter Write Barriers For Arm64 In .NET CoreCLR


Last year, I explored how you can use the Arm Scalable Vector Extension (SVE) in .NET to unlock SIMD performance at scale. This year, my focus has shifted to something less visible but just as fundamental to runtime performance: write barriers in the CoreCLR garbage collector (GC). Write barriers are not a feature most .NET developers ever think about. They do not change how you write C# cod... » read more

Whale-Inspired Propulsion System To Reduce Operating Costs By 20%


This article is an excerpt from the presentation delivered by Bluefins at CadenceCONNECT CFD 2025. Shipping is essential to global trade, as it transports nearly 90% of all traded goods. Large ships consume between 20 and 70 tons of fuel daily, which translates to approximately 15 million euros in annual fuel costs. This level of fuel consumption results in emissions of up to 75,000 tons of C... » read more
