System-Level Design For 1.6 Tbps Interoperability In AI Data Centers


By Madhumita Sanyal and Diwakar Kumaraswamy

The rapid escalation of AI/ML workloads, driven by increasingly large language models, is reshaping high-performance computing and AI data center architectures. Real-time inference and large-scale training are pushing the limits of compute and interconnect performance. With model sizes and parameter counts doubling every 4–6 months, infrastruct... » read more

Re-Architecting AI For Power


The industry is becoming increasingly concerned about the amount of power being consumed by AI, but there is no simple solution to the problem. It requires a deep understanding of the application, the software and hardware architectures at both the semiconductor and system levels, and how all of this is designed and implemented. Each piece plays a role in the total power consumed and the utilit... » read more

Maximize Uptime And Improve TCO: RAS And Telemetry In HBM4 For Data Centers


As AI workloads scale and data center operations become increasingly complex, it is critical to keep the infrastructure up and running. Total Cost of Ownership (TCO) is a key metric that includes not only the upfront cost of hardware but also the ongoing expenses of power, cooling, maintenance, and—most importantly—downtime. A single memory failure in a hyperscale AI cluster can cascade int... » read more

UEC-CBFC: Credit-Based Flow Control For Next-Gen Ethernet In AI And HPC


Ethernet has long been the backbone of networking, from simple web browsing to cloud computing, data centers, automobiles, and more. Ethernet has enabled countless innovations, and now it's expanding to meet the demands of AI and HPC. As the world shifts toward these new technologies, new challenges are emerging. These include increased scale, higher bandwidth density, mult... » read more

Reliable Training Data Paramount To AI Model Success


AI systems are increasingly being integrated into safety- and mission-critical applications ranging from automotive to health care and industrial IoT, stepping up the need for training data that is reliable, secure, and generated from trusted sources. AI activity is growing exponentially, as everybody tries to figure out how to apply it to their domain, application, or workload. In ... » read more

Start Experimenting With Neural Super Sampling For Mobile Graphics


Mobile game developers around the world face increasing pressure to meet user expectations for sharper visuals, smoother gameplay, and longer battery life. Balancing these goals on constrained mobile devices often means making trade-offs. Traditional upscaling methods offer limited flexibility. Real-time AI rendering remains complex, power-hungry, or hardware dependent. Neural Super Sampling... » read more

Workload-Specific Hardware Accelerators


Workload-specific hardware accelerators are becoming essential in large data centers for two reasons. One is that general-purpose processing elements cannot keep up with the workload demands or latency requirements. The second is that they need to be extremely efficient due to limited electricity from the grid and the high cost of cooling these devices. Sharad Chole, chief scientist and co-foun... » read more

Building An AI Chip: Pre Silicon Planning


This white paper highlights the challenges of AI chip design, including balancing performance, cost, and power efficiency. It emphasizes the importance of early architecture exploration to avoid costly design revisions and ensure optimal power-performance trade-offs. The paper underscores the need for secure, efficient, and scalable IP solutions to meet the evolving demands of AI applications, ... » read more

The Painful Reality Of Scaling Cloud AI


The shift to Generative AI (GenAI) has overwhelmed existing infrastructure, transforming previously rare issues into daily operational realities. Skyrocketing costs, intense energy consumption, and hardware failures at unprecedented scales illustrate the strain of current AI workloads. With models like GPT-4 costing tens of millions and GPT-5 projected to surpass a billion-dollar threshold, the... » read more

What’s Different About HBM4


Memory bandwidth is limiting the flow of huge datasets that are needed to train AI models. There is much more data to process, store, and retrieve, but the speed at which that data moves through high-bandwidth memory (HBM) stacks is significantly lower than the speed at which data can be processed. Frank Ferro, group director for product management at Cadence, talks about the new HBM4 standard,... » read more
