AI, GPU, And HPC Data Centers: The Infrastructure Behind Modern AI


Artificial intelligence (AI) is stretching compute infrastructure well beyond what traditional enterprise data centers were designed to handle. Modern AI training requires massively parallel compute, low-latency networking, high-throughput storage pipelines, and facility engineering that can safely support higher rack power densities than legacy environments. These demands are fueling the eme... » read more

UCIe’s Major Technical Components Are Now In Place


Key Takeaways: UCIe 3.0 doubles bandwidth and enhances manageability, addressing new use cases and following an annual update cycle since 2023. The growing demand for chiplet-based architectures in AI data centers is driven by the limitations of monolithic chips, making inter-chiplet communication and connectivity crucial. While UCIe was initially seen as feature-heavy, many of its ma... » read more

Power Leadership At 2nm: Foundation IP Optimized For Next-Gen Hyperscale SoCs


By Andrew Appleby and Daryl Seitzer

As demand for data center compute accelerates, power efficiency has become the defining metric for modern CPUs, GPUs, and AI accelerators. Every watt saved directly impacts the massive operating costs of gigawatt-scale AI data centers, where power and cooling account for 40–60% of operational expenditures. To reduce energy consumption and strengthen t... » read more

Scaling llama.cpp On Neoverse N2: Solving Cross-NUMA Performance Issues


This blog post explains the cross-NUMA memory access issue that occurs when you run llama.cpp on Neoverse. It also introduces a proof-of-concept patch that addresses this issue and can provide up to a 55% performance increase for text generation when you run the llama3_Q4_0 model on the ZhuFeng Neoverse system.

Cross-NUMA memory access problem

In llama.cpp, performance drops when the number o... » read more

Minimum Energy Per Query


Key Takeaways: Extracting heat from a chip faster is a short-term fix to a bigger problem. The longer-term challenge is how to reduce the amount of energy used per query. Data movement, guardbanding, and software inefficiency are key targets for the future.

Heat is a serious problem within AI chips, and it is limiting how much processing can be done. The solution is either to... » read more

Simulate Faster with SimAI Software for High Returns at a Low Cost of Ownership


While the value of engineering simulation has been proven for decades, only a small percentage of engineers report using artificial intelligence (AI) to amplify their simulation results at scale. However, AI has the potential to make large-scale simulations even faster, more precise, and more cost-effective. AI-enabled simulation not only amplifies the performance and returns of simulatio... » read more

Future-Proofing System Design


This whitepaper has explored how converging forces—AI-driven workloads, heterogeneous integration, and increasingly complex security requirements—are transforming design priorities. Adaptability, openness, and lifecycle management are no longer secondary considerations but core architectural imperatives. Standardization through initiatives such as UCIe and OCP fosters interoperability and s... » read more

5 Strategic Decisions for Building a Scalable Compute Platform for Now and the Future


Artificial intelligence (AI) is no longer a “nice-to-have” technology—it’s a central driver of competitive advantage and business innovation. Across industries, enterprises are moving beyond experimentation and embedding AI into all their products, workflows, and customer experiences. But as organizations scale, many are discovering a stark reality: their compute infrastructure was not ... » read more

Research Bits: Feb. 9


Computing with heat

Researchers from the Massachusetts Institute of Technology (MIT) designed silicon structures that can perform calculations in an electronic device using excess heat instead of electricity. The device was created using a software system that automatically designs a material that can conduct heat in a specific manner. The inverse design technique allowed the researchers to... » read more

Changes In Chip Architectures At The Edge


Edge computing is all about low latency, within a tight power budget, and with sufficient performance. This is very different from an AI data center, where the real focus is on data throughput between processor and memory. Achieving those goals requires a focus on what different processing elements bring to the table. Nigel Drego, co-founder and CTO of Quadric, talks about how these different c... » read more
