New USB Standards: Benefits And Incompatibilities


Just because it's a standard doesn't mean everything will work together, and this is especially evident with conflicting USB standards. David Shin, senior product marketing manager at Cadence Design Systems, explains where incompatibilities can crop up, why it's so difficult to control the different versions, and what's behind all this confusion. » read more

Will Your Chip’s Memory Work As Expected?


Increased density at advanced nodes, multi-die assemblies, and the rollout of AI everywhere are making it much more challenging to ensure that memory will function properly over its expected lifetime. Test is no longer about a single memory or one approach for testing memory. It can vary by application, by workload, and by architecture. Some testing is close to memory, some is built into memory... » read more

Google Details Five Generations Of TPU Training Supercomputers


Researchers from Google and the University of California, Berkeley published a technical paper titled “Google’s Training Supercomputers from TPU v2 to Ironwood: Architectural Stability, Scale, Resilience, Power Efficiency, and Sustainability Across Five Generations.” The paper summarizes five generations of Google TPUs, from TPU v2 through Ironwood, and examines how the systems evolved int... » read more

Enhancing High Bandwidth Memory (HBM) Reliability With 3D X-ray Inspection


High Bandwidth Memory (HBM) is revolutionizing AI, high-performance computing, and advanced graphics systems. Its 3D architecture—stacked DRAM dies interconnected via through-silicon vias (TSVs)—delivers exceptional bandwidth and efficiency. But this complexity introduces new challenges for inspection and quality assurance. Why 3D X-ray for HBM? Traditional 2D X-ray imaging cannot fully v... » read more

Four-Tier Memory Hierarchy for LLM Reasoning (USC, UW)


A new technical paper, "Not All Thoughts Need HBM: Semantics-Aware Memory Hierarchy for LLM Reasoning," was published by researchers at USC and the University of Wisconsin-Madison. Abstract: "Reasoning LLMs produce thousands of chain-of-thought tokens whose KV cache must reside in scarce GPU HBM. The dominant response -- permanently evicting low-importance tokens -- is catastrophic for reasoni... » read more

Flash Getting Stacked High-Bandwidth Version


Key Takeaways: A new HBF 3D flash stack is similar to HBM for use in AI processing. HBF capacity will be much higher, allowing static storage of AI model weights, with optimized read speed. Samples are due out later this year, with accelerators featuring it coming out next year. AI inference using modern models requires billions of parameters, and moving them to where they c... » read more

AI Accelerator Testing Depends On DFT Innovations


Key Takeaways: I/O and lane repair capabilities are becoming critical to improving yield. System-level testing catches marginal defects and rare defects such as silent data corruption errors. Synopsys and TSMC developed a multi-die demo vehicle capable of full test, monitor, debug, and repair across the system’s lifecycle. The proliferation of accelerators in AI... » read more

HBM Shifts Testing Left To Preserve AI Chip Yield


Key Takeaways: A high-yield, known-good stack requires multiple test insertions. Known-good-stack testing poses challenges for power delivery and thermal management. The shift to HBM4 and HBM5 will increase the pressure for shift-left test flows. Taller high-bandwidth memory (HBM) stacks and tighter TSV pitch are impacting AI module yields. The solution is to push test furth... » read more

Replacing GPU Compute Dies With PNM-Enabled HBM Cubes For Long-Context Decode Attention (UCSD, Columbia, Yonsei U., NVIDIA, Samsung)


A new technical paper, "AMMA: A Multi-Chiplet Memory-Centric Architecture for Low-Latency 1M Context Attention Serving," was published by researchers at UC San Diego, Columbia University, Yonsei University, NVIDIA, and Samsung. Abstract: "All current LLM serving systems place the GPU at the center, from production-level attention-FFN disaggregation to NVIDIA's Rubin GPU-LPU heterogeneous p... » read more

When Semiconductor Materials Misbehave


Key Takeaways: Material behavior in production depends on the process context that no development environment can fully replicate. In advanced packaging, the interactions that cross domain boundaries are increasingly where failures originate. The most accurate materials data is also the most commercially sensitive, leaving simulation models calibrated against generic inputs rather tha... » read more
