All Software Is Hardware-Dependent


I was lucky in my early career that I found two sets of great mentors. The first happened recently after graduating when I joined the Hilo development team. Members of that team included Phil Moorby, Simon Davdimann, Peter Flake, and others. They all had very different coding personalities, but most importantly, they worked as a team and used good foundational processes. One outcome of that ... » read more

AI Workloads Are Turning The Data Center Network Into A Combined Memory And Storage Fabric


Recent industry trends, including the release of NVIDIA’s Rubin platform (developer.nvidia.com), point to a growing consensus that AI inference is reshaping data center architecture in a fundamental way. As inference workloads become dominant, the data center network is no longer just a communication layer between servers. It is increasingly part of a distributed memory and storage hierarchy,... » read more

Memory Wall Gets Higher


Key Takeaways An increasing percentage of the chip area is consumed by the same amount of SRAM for each node shrink. The problem is not limited to leading-edge AI, as it will eventually impact even small MCUs and MPUs. Architectural changes may be required. Stacking SRAM chiplets on logic is possible but expensive. SRAM is a vital piece of all computing systems, but its fail... » read more

HBM4E Raises The Bar For AI Memory Bandwidth


The pace of AI innovation continues to expose a painful reality. Compute keeps scaling, but memory bandwidth remains one of the hardest bottlenecks to remove. As AI models grow larger and more complex, feeding data fast enough into accelerators has become just as critical as raw compute capability. High Bandwidth Memory (HBM) has been central to solving this challenge, and the next step in that... » read more

AI Inference Needs A Mix-And-Match Memory Strategy


AI inference is no longer a single workload that can be served efficiently by a single type of accelerator or memory. From fast chat replies to 10M token codebases, inference spans wildly diverse workloads with very different limits on latency, bandwidth, capacity, and compute, as the figure below demonstrates.1 Source: Meta1 The AI inference spectrum of workloads includes: Inter... » read more

Oxides Bring Low Leakage Transistors To Leading-Edge Memories


AI workloads need to position more memory that uses less power in ever-closer proximity to computational logic. That overriding imperative is driving new memory designs and new materials exploration across a wide range of applications, including cache memory, working memory, as well as a new category, non-volatile memory used for direct computation. The largest of these, by volume, is workin... » read more

GDDR7 Momentum Accelerates As A Key Solution For AI Inference


The AI hardware landscape continues to evolve at a breakneck speed, and memory technology is rapidly becoming a defining differentiator for the next generation of GPUs and AI inference accelerators. When NVIDIA introduced Rubin CPX, its new class of GPU tailored for massive context inference, it underscored a new industry reality: memory throughput and efficiency are now just as critical as ra... » read more

Reliability Extension Architecture For Cost-Effective HBM (RPI, ScaleFlux, IBM TJ Watson)


A new technical paper titled "Making Strong Error-Correcting Codes Work Effectively for HBM in AI Inference" was published by researchers at Rensselaer Polytechnic Institute, ScaleFlux and IBM T.J. Watson Research Center. Abstract "LLM inference is increasingly memory bound, and HBM cost per GB dominates system cost. Current HBM stacks include short on-die ECC that tightens binning, raise... » read more

TEE.fail: When Your Security System Leaves the Window Open


Let’s talk about a cybersecurity attack that’s been making waves: TEE.fail. TEE stands for Trusted Execution Environment. Sounds reassuring, right? But here’s the kicker: exactly what a TEE is, and what it’s supposed to guarantee, is surprisingly unclear. TEEs have been around for about a decade, but as with many things in security, the rules are more like guidelines. You might think, �... » read more

AI Memory: Enabling The Next Era Of High-Performance Computing


The rapid advancement of artificial intelligence (AI) is driving unprecedented demand for high-performance memory solutions. AI-driven applications are fueling significant year-over-year growth in high-bandwidth memory (HBM). However, as AI models grow in complexity—from large language models (LLMs) to real-time inference applications—the need for faster, higher-bandwidth, and energy-effici... » read more

← Older posts Newer posts →