Tokens-per-watt is now the primary metric driving AI data center optimization.
At the recent Data Center World 2026 in Washington, D.C., one message came through louder than ever: AI infrastructure is scaling faster than any system we’ve built before—and the industry can no longer afford to design it in silos.
The workshop “More Massive Still! Delivering AI-Driven Scale in the Face of Historic Constraints” captured this perfectly: the industry is shifting from traditional data centers, where IT and facility teams operate in silos and optimize against each other around competing metrics (uptime and PUE), to fully integrated “AI factories” where the entire system is judged by one metric: tokens-per-watt.
And that shift changes everything.
The discussion brought together leaders across the ecosystem, including Greg Stover (Vertiv), Al Nichols (Silverback Data Center Solutions), Josh Claman (Accelsius), Kourosh Nemati (NVIDIA), Nathan Mallamace (Supermicro), Rob Curtis (AMD), and Sherman Ikemoto (Cadence Design Systems).
For decades, data center architecture evolved in layers—chips, packages, racks, cooling, power, and facilities—each optimized independently. That model worked when Moore’s Law drove progress.
But as the session highlighted, “now the system is the chip.”
This was reinforced across speakers:
Power density is accelerating rapidly, with AI factories scaling from 100 MW to 1 GW and beyond. Across all perspectives, the conclusion was consistent: the ecosystem must design together, or it will fail to scale.
One of the most powerful concepts discussed was the “stack tax.”
When each layer of the infrastructure is overbuilt independently “for safety,” inefficiencies compound across the system. According to the session data:
The impact is staggering.
A well-optimized 1 GW NVIDIA Vera Rubin AI factory will operate approximately 300,000 GPUs, while the same factory, left unoptimized, might reach only 65% of its throughput potential.
This isn’t just an efficiency problem—it’s a business problem. The difference directly translates into token output, revenue, and competitive advantage.
And as AI workloads shift increasingly toward inference at massive scale, this gap will only widen.
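To make the compounding behind the stack tax concrete, here is a minimal sketch in Python. It assumes each layer independently holds back a fixed fraction of capacity "for safety" and that those reserves multiply through the stack. The layer names come from the list earlier in this post; the 10% margins are hypothetical placeholders, not figures from the session.

```python
from math import prod

# Hypothetical safety margins: the fraction of capacity each layer holds in reserve.
# Illustrative values only; they are NOT the session's data.
layer_margins = {
    "chips": 0.10,
    "packages": 0.10,
    "racks": 0.10,
    "cooling": 0.10,
    "power": 0.10,
    "facilities": 0.10,
}


def usable_fraction(margins: dict[str, float]) -> float:
    """Capacity left after every layer independently applies its own margin."""
    return prod(1.0 - m for m in margins.values())


usable = usable_fraction(layer_margins)
tax = 1.0 - usable  # the compounded "stack tax"

print(f"largest single-layer margin: {max(layer_margins.values()):.0%}")
print(f"usable capacity after all layers: {usable:.1%}")
print(f"compounded stack tax: {tax:.1%}")
```

Under these assumed numbers, no single layer reserves more than 10%, yet the stack as a whole gives up close to half of its capacity. Real margins differ by layer and are not strictly multiplicative, so treat this as an intuition aid rather than a model of any actual facility.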
NVIDIA described AI infrastructure as a “five-layer cake”: energy, infrastructure, chips, models, and applications. All five must scale together.
That is exactly the challenge the industry faces today.
Even emerging innovations like:
…all require coordination across domains that historically never interacted closely.
This is where the industry must evolve—from connected tools to a connected system design methodology.

Rendering of the SimReady NVIDIA GB300 NVL72 model in the Cadence Reality Digital Twin Platform, powered by NVIDIA Omniverse libraries, demonstrating detailed airflow simulation within an AI factory environment.
This is precisely where Cadence plays a unique role. As multiple speakers acknowledged during the session, Cadence is helping bridge the gap between disciplines—from chip design to facility optimization.
Why does this matter? Because the modern AI factory is not a collection of components—it’s a tightly coupled system.
Cadence enables:
This is not just incremental improvement—it’s a fundamental shift.
In fact, Cadence’s Reality Digital Twin approach demonstrates tangible impact:
That’s the power of eliminating the stack tax.
Another key takeaway from the session was the shift in performance metrics.
The industry is moving toward:
This reframes optimization entirely.
It’s no longer enough to:
Instead, the goal is to maximize end-to-end token output under real-world constraints.
That means:
This is a systems problem—and it demands systems thinking.
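As a rough illustration of how the metric behaves, the sketch below treats tokens-per-watt as end-to-end tokens per second divided by total facility power in watts. That definition is an assumption for this example, since the session did not spell out a formula. The 1 GW facility size and the 65% throughput figure come from this post; the peak token rate is a hypothetical placeholder, and the unoptimized factory is assumed to draw the same power as the optimized one.

```python
from dataclasses import dataclass


@dataclass
class FactoryScenario:
    name: str
    facility_power_w: float   # total facility power draw, in watts
    tokens_per_second: float  # end-to-end token output of the whole factory

    def tokens_per_watt(self) -> float:
        # Assumed definition: tokens per second per watt (i.e. tokens per joule).
        return self.tokens_per_second / self.facility_power_w


FACILITY_POWER_W = 1e9        # 1 GW, the facility size cited in this post
PEAK_TOKENS_PER_SECOND = 1e9  # hypothetical placeholder, not a measured rate

optimized = FactoryScenario("optimized", FACILITY_POWER_W, PEAK_TOKENS_PER_SECOND)
# Assumption: same power draw, but only 65% of the throughput potential is reached.
unoptimized = FactoryScenario(
    "unoptimized", FACILITY_POWER_W, 0.65 * PEAK_TOKENS_PER_SECOND
)

for scenario in (optimized, unoptimized):
    print(f"{scenario.name}: {scenario.tokens_per_watt():.2f} tokens/s per watt")

ratio = unoptimized.tokens_per_watt() / optimized.tokens_per_watt()
print(f"unoptimized delivers {ratio:.0%} of the optimized tokens-per-watt")
```

Because the denominator is the whole facility's power rather than any one layer's, a gain in a single subsystem only counts if it raises the token output of the full system.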
The main takeaway from the session was clear: the industry must align around a shared ecosystem.
Key enablers include:
Because no single company can solve this alone.
But together, the industry can build AI infrastructure that is:
We are in the era of the AI factory—where data centers are no longer passive infrastructure, but active systems producing intelligence in real time.
The scale is unprecedented. The constraints are real. And the margin for inefficiency is gone. The only way forward is integration. And that is where Cadence becomes the glue—connecting chips to cooling, power to performance, and design to operations.
Because in the age of AI, the system is the product—and only a unified approach will unlock its full potential.