Server-Scale Programmable Photonic Fabric to Interconnect Accelerators Within Servers (Cornell University, Lightmatter)


A new technical paper titled "Morphlux: Programmable chip-to-chip photonic fabrics in multi-accelerator servers for ML" was published by researchers at Cornell University and Lightmatter. Abstract: "We optically interconnect accelerator chips (e.g., GPUs, TPUs) within compute servers using newly viable programmable chip-to-chip photonic fabrics. In contrast, today, commercial multi-accelerat... » read more

System-Level Design For 1.6 Tbps Interoperability In AI Data Centers


By Madhumita Sanyal and Diwakar Kumaraswamy

The rapid escalation of AI/ML workloads, driven by increasingly large language models, is reshaping high-performance computing and AI data center architectures. Real-time inference and large-scale training are pushing the limits of compute and interconnect performance. With model sizes and parameter counts doubling every 4–6 months, infrastruct... » read more

Re-Architecting AI For Power


The industry is becoming increasingly concerned about the amount of power being consumed by AI, but there is no simple solution to the problem. It requires a deep understanding of the application, the software and hardware architectures at both the semiconductor and system levels, and how all of this is designed and implemented. Each piece plays a role in the total power consumed and the utilit... » read more

Maximize Uptime And Improve TCO: RAS And Telemetry In HBM4 For Data Centers


As AI workloads scale and data center operations become increasingly complex, it is critical to keep the infrastructure up and running. Total Cost of Ownership (TCO) is a key metric that includes not only the upfront cost of hardware but also the ongoing expenses of power, cooling, maintenance, and—most importantly—downtime. A single memory failure in a hyperscale AI cluster can cascade int... » read more

The Painful Reality Of Scaling Cloud AI


The shift to Generative AI (GenAI) has overwhelmed existing infrastructure, transforming previously rare issues into daily operational realities. Skyrocketing costs, intense energy consumption, and hardware failures at unprecedented scales illustrate the strain of current AI workloads. With models like GPT-4 costing tens of millions and GPT-5 projected to surpass a billion-dollar threshold, the... » read more

Transforming Test For Co-packaged Optics


Data centers are undergoing a dramatic transformation to reduce the power consumption of high-speed data transmissions by 70% or more with co-packaged optics. By moving optical transceivers from the fronts of racks into the same package as the networking switch and HBMs, AI programs that used to take a week to run can now be completed in a day. To enable this change in production manufacturi... » read more

UEC-LLR: The Future Of Loss Recovery In Ethernet For AI And HPC


As Artificial Intelligence (AI) and High-Performance Computing (HPC) systems become the backbone of modern data centers, they generate and consume massive amounts of data. Traditional Ethernet was not built for such high-bandwidth traffic. In HPC and AI workloads, computations are distributed across nodes, and data is shared in real time with low latency and lossless communication. As ... » read more

For Chip Developers, HW/SW Co-Design Key To Data Center Efficiency


Data centers and high-performance computing (HPC) are the primary enablers of today’s power-hungry AI-driven technology, but chip designers, EDA vendors, and the data centers themselves have a long list of options available to them to help curb AI’s power consumption. Chip designers play a critical role in ensuring energy-efficient processing from the bottom up, whether that is hardware-so... » read more

Scaling In The AI Era: The Role Of PCI Express 7.0 Switches In Next-Gen Data Centers


As artificial intelligence (AI) workloads continue to scale in complexity and volume, the infrastructure that supports them must evolve just as rapidly. At the heart of this transformation lies PCI Express 7.0 (PCIe 7.0), a next-generation interconnect standard that is redefining how data moves within high-performance computing (HPC) and AI-driven data centers. PCIe 7.0 doubles the raw bit r... » read more

Designing The AI Factories: Unlocking Innovation With Intelligent IP


The rapid evolution of artificial intelligence (AI) is reshaping the technological landscape, driving unprecedented demands on computing infrastructure. At the heart of this transformation lie innovations in intellectual property (IP) that enable scalable, efficient, and performance-driven AI factories. These advancements are central to addressing the technical challenges of modern AI workloads... » read more
