Unlocking the full potential of AI requires innovation from algorithms and architecture to foundational silicon technologies.
By Pushkar Apte, Jim Sexton, and Melissa Grupen-Shemansky
The world is abuzz with the new opportunities being created by artificial intelligence (AI), enabled by the availability of unprecedented amounts of data. AI runs on the semiconductor engine and, in turn, creates rising demand for semiconductor chips. McKinsey & Co. predicts that the semiconductor industry will reach $1 trillion in revenue by 2030, in large part due to the market demand for AI and data. There are, however, formidable challenges to overcome for this virtuous cycle to continue. The SEMI Smart Data-AI Initiative, together with the SEMI Future of Computing Think Tank, is working to help the industry address these challenges.
“To unlock the full potential of AI, innovation is required across the technology stack – from the models and software to data center architecture, chip design and how those chips are made,” said Gary Dickerson, president and CEO of Applied Materials. “Advancements in foundational semiconductor technologies will have a dramatic impact on system-level energy and cost reduction in the AI data center.”
Investment in AI system infrastructure is rising at a dizzying pace, with hundreds of billions of dollars being committed by individual companies as well as public-private partnerships around the world. AI models trained on larger data sets generally deliver better results, so model sizes are growing exponentially, with leading-edge models requiring billions or even trillions of parameters. This is especially true of the rapidly growing Large Language Models (LLMs) used for generative AI.
Can the foundational semiconductor technology keep up? Even if semiconductor chips still followed the famous Moore’s Law, transistor density would only double every two years. The real pace of improvement is slower, as leading-edge technologies approach the physical limits of materials, with the tiniest patterned dimensions on chips now approaching fundamental interatomic distances. While semiconductor designers and process technologists continue to innovate with new materials, devices, 3D stacking, and so forth, it remains a formidable challenge for silicon chips and hardware systems to keep up with the growth rate of AI models and data sets.
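To make the mismatch concrete, the short sketch below compounds the two growth rates. The six-month model-doubling cadence is an assumption based on commonly cited industry estimates, not a figure from this article:

```python
# Back-of-envelope comparison of hardware vs. AI model growth rates.
# Assumptions (illustrative): transistor density doubles every 24
# months (Moore's Law), while leading AI model sizes double roughly
# every 6 months -- a commonly cited industry estimate.

hardware_doubling_months = 24   # Moore's Law cadence
model_doubling_months = 6       # assumed model-size doubling cadence

for years in (2, 4, 6, 8, 10):
    months = years * 12
    hw_growth = 2 ** (months / hardware_doubling_months)
    model_growth = 2 ** (months / model_doubling_months)
    gap = model_growth / hw_growth
    print(f"{years:>2} yrs: hardware x{hw_growth:,.0f}, "
          f"models x{model_growth:,.0f}, gap x{gap:,.0f}")
```

After a decade under these assumptions, hardware improves about 32-fold while model sizes grow about a million-fold, leaving a gap that transistor scaling alone cannot close.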
Processing ever-larger data sets and AI models also requires increasing energy. A recent report by the US Department of Energy indicates that data center energy consumption tripled over the past decade and may triple again in just five years! Other analyses show that a single data center powered by 20,000 GPUs can draw almost 40,000 kW, enough to power about 31,000 US homes. Consequently, it is challenging for data centers to meet their power needs through public utilities, and several hyperscalers are investing in nuclear power.
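Those figures are easy to sanity-check. The sketch below derives the per-GPU and per-home power they imply; the derived numbers follow from the totals above and are not independently sourced:

```python
# Sanity check of the data center power figures cited above.

gpus = 20_000
datacenter_kw = 40_000          # total draw cited above

kw_per_gpu = datacenter_kw / gpus
print(f"Implied draw per GPU (incl. cooling/overhead): {kw_per_gpu:.1f} kW")
# -> 2.0 kW, plausible for a modern AI accelerator plus its share
#    of power delivery and cooling infrastructure

homes = 31_000
kw_per_home = datacenter_kw / homes
print(f"Implied average US household draw: {kw_per_home:.2f} kW")
# -> ~1.3 kW continuous, roughly 11,000 kWh/year, in line with
#    published US residential averages
```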
This acceleration in AI energy demand is further exacerbated because silicon technology no longer follows Dennard scaling (sometimes called Dennard’s Law), which held that power density remains constant as technology scales to tinier dimensions. In fact, the power density of silicon devices has been rising with technology scaling for the last decade. These combined factors give rise to the second formidable challenge: energy consumption is rising unsustainably for AI systems.
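For readers who want the underlying math, here is the standard textbook form of the scaling relationship (not a derivation from this article). When dimensions, voltage, and capacitance all shrink by a factor κ and frequency rises by κ, power falls exactly as fast as area, so power density stays flat; once voltage can no longer be reduced, that balance breaks:

```latex
% Dynamic switching power of a CMOS circuit:
P \;=\; C\,V^{2}\,f
% Ideal Dennard scaling by a factor \kappa > 1:
%   C \to C/\kappa,\quad V \to V/\kappa,\quad f \to \kappa f,\quad A \to A/\kappa^{2}
P' \;=\; \frac{C}{\kappa}\cdot\frac{V^{2}}{\kappa^{2}}\cdot\kappa f
   \;=\; \frac{P}{\kappa^{2}},
\qquad
\frac{P'}{A'} \;=\; \frac{P/\kappa^{2}}{A/\kappa^{2}} \;=\; \frac{P}{A}.
```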
Addressing these challenges requires innovation from algorithms and architecture to foundational silicon technologies. The following are illustrative examples (not comprehensive) spanning the entire AI system stack.
At the software and algorithm level, innovators are finding ways to reduce model size and use hardware more efficiently. For example, IBM’s Granite models are smaller in size, with fewer than a billion parameters. Similarly, Google’s Gemma platform offers small language models (SLMs). The recent market disruption caused by the release of the DeepSeek reasoning model suggests that smaller, domain-specific reasoning models may offer significant efficiencies.
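One way to see why smaller models ease the hardware burden: the memory needed just to hold a model’s weights scales linearly with parameter count and numeric precision. A minimal sketch, with illustrative model sizes that are assumptions rather than specifications of the products named above:

```python
# Weight-memory footprint as a function of parameter count and
# precision. Model sizes below are illustrative assumptions.

def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in gigabytes."""
    return params * bytes_per_param / 1e9

for name, params in [("~1B-parameter SLM", 1e9),
                     ("70B-parameter LLM", 70e9),
                     ("1T-parameter frontier model", 1e12)]:
    fp16 = weight_memory_gb(params, 2)   # 16-bit weights
    int8 = weight_memory_gb(params, 1)   # 8-bit quantized weights
    print(f"{name:<28} fp16: {fp16:>6,.0f} GB   int8: {int8:>6,.0f} GB")
```

A sub-billion-parameter model fits comfortably on a single edge device, while a trillion-parameter model demands terabytes of memory spread across many accelerators.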
At the architectural level, multiple paths are being explored. Special-purpose (or domain-specific) processing elements can deliver improved performance at equal or lower power for specific tasks. Examples include Cerebras’ wafer-scale designs with optimized AI accelerators and Mueon’s system-scale integration solutions.
Another innovation path focuses on bringing computing closer to the memory elements where the data resides. This addresses the major bottleneck between processors and memory in the traditional von Neumann architecture, which has been the industry’s mainstay since its inception. In-memory or near-memory computing, such as memory-centric architectures from Micron or processing-in-memory (PIM) solutions from SK hynix, offers higher performance with lower energy consumption for certain workloads. In parallel, leading CPU and GPU makers like AMD, Intel, and NVIDIA continue to innovate with power-efficient solutions. And “Edge Intelligence” innovations – for example, internet-of-things (IoT) solutions from Arm and Qualcomm – help reduce the processing and power load on data centers by executing more operations on edge devices.
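A rough sense of why this matters: in conventional architectures, fetching an operand from off-chip DRAM costs orders of magnitude more energy than computing on it. The sketch below uses assumed order-of-magnitude energies in the spirit of widely cited architecture estimates (e.g., Horowitz, ISSCC 2014); they are not figures from this article:

```python
# Rough illustration of the von Neumann bottleneck: moving data costs
# far more energy than computing on it. Per-operation energies are
# order-of-magnitude assumptions from architecture literature.

ENERGY_PJ = {
    "fp32 multiply":       4,    # on-chip arithmetic
    "on-chip SRAM access": 10,   # local cache/buffer read
    "off-chip DRAM access": 640, # external memory, per 32-bit word
}

base = ENERGY_PJ["fp32 multiply"]
for op, pj in ENERGY_PJ.items():
    print(f"{op:<22} ~{pj:>4} pJ  ({pj / base:>4.0f}x a multiply)")

# In-memory and near-memory designs attack the dominant DRAM term
# rather than the arithmetic itself.
```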
Critical enabling technologies also contribute significantly. Advanced packaging – for example, ASE’s heterogeneous integration solutions – enables efficient, high-performance computing by optimally integrating multiple diverse components. Another emerging development is the advent of “chiplets,” which split a chip into smaller parts and enable special-purpose accelerator building blocks to be assembled with more general processor, memory, and interconnect elements. A well-developed chiplet ecosystem could give silicon designers more degrees of freedom to design optimized systems.
Looking beyond electronics, the integration of photonics can enable low-power, high-bandwidth connectivity – for example, LightMatter’s silicon photonics interconnects and Ciena’s data center interconnects.
Materials and devices form the foundation of the technology stack. Example innovations include the Stanford University-led N3XT, a 3D solution that integrates multiple novel devices and materials, including resistive and spin-transfer torque RAMs, carbon nanotubes, and 2D materials. Similarly, a University of California-led effort combines low-dimensional nanostructures, sensors, detectors, and photonics in an integrated solution. Finally, advanced and innovative processes and equipment are being developed – for example, by Applied Materials and Lam Research – to fabricate these novel materials and devices.
All these individual innovations are amazing and necessary, but are they sufficient? What if we could collaborate across the entire system and co-optimize hardware and software innovations synergistically? Could the integrated whole be greater than the sum of its parts? What efficiencies could we unleash? And what business opportunities would this unlock?
SEMI seeks to answer these questions by uniting AI innovation leaders from industry, academia and start-ups, including most of the companies and universities mentioned in this article. We will begin building pre-competitive collaboration that breaks through silos and explores system-level solutions – with the ultimate objective of radically improving the energy-efficiency of computing for AI.
Jim Sexton is a Fellow at IBM.
Melissa Grupen-Shemansky is CTO and VP of Technology Communities at SEMI.