Addressing energy consumption has become a requirement as AI takes root, but it requires changes across the entire ecosystem.
The massive power needs of AI systems are putting a spotlight on sustainability in the semiconductor ecosystem. The chip industry needs to be able to produce more efficient and lower-power semiconductors. But demands for increased processing speed are rising with the widespread use of large language models and the overall increase in the amount of data that needs to be processed. Gartner estimates that 50% of organizations will adopt sustainability-enabled monitoring by 2026 to manage energy consumption and carbon footprint metrics for their hybrid cloud environments. This is necessary, given that 40% of data centers are likely to be power-constrained by 2027, according to the firm.
Semiconductors can help with sustainability if they are designed correctly. That’s true even for AI/ML chips designed for maximum performance, provided they can get computations done more quickly using sparser algorithms. That results in an overall reduction in the amount of energy consumed. This is where custom accelerators come into play. They can provide significant improvements compared with general-purpose processors, often working in tandem with them.
AI’s growing footprint
Computing is happening everywhere, and all of it needs to be more efficient. Smart cities, smart infrastructure, and smart transportation are not possible without smart technology, and increasingly that is enabled by AI. As AI becomes more entrenched, the semiconductor ecosystem is trying to minimize its impact on resources.
Today’s data centers already consume enormous amounts of power. Globally, 460 terawatt-hours (TWh) of electricity are needed annually. That’s equivalent to the entire amount of energy produced by Germany. In the United States, data center electricity consumption was 2.5% of the U.S. total (~130 TWh) in 2022, and that is expected to triple to 7.5% (~390 TWh) by 2030, according to the Boston Consulting Group. That is the equivalent of the electricity used by about 40 million U.S. houses, or nearly a third of the total homes in the U.S.
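The U.S. figures above can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in this article. The variable names are illustrative, and the derived values (implied U.S. total consumption and per-home usage) are back-of-the-envelope results, not figures from the Boston Consulting Group.

```python
# Back-of-the-envelope check of the U.S. data center figures quoted above.
# All inputs come from the article; the derived values are illustrative only.

TWH_TO_KWH = 1e9  # 1 terawatt-hour = 1 billion kilowatt-hours

dc_2022_twh = 130.0    # data center consumption in 2022 (~130 TWh)
dc_2022_share = 0.025  # 2.5% of the U.S. total
dc_2030_twh = 390.0    # projected consumption in 2030 (~390 TWh)
dc_2030_share = 0.075  # 7.5% of the U.S. total
homes = 40e6           # "about 40 million U.S. houses"

# Total U.S. consumption implied by each share/consumption pair.
implied_total_2022_twh = dc_2022_twh / dc_2022_share
implied_total_2030_twh = dc_2030_twh / dc_2030_share

# Growth factor in data center consumption ("expected to triple").
growth_factor = dc_2030_twh / dc_2022_twh

# Annual electricity per home implied by the 40-million-home comparison.
kwh_per_home = dc_2030_twh * TWH_TO_KWH / homes

print(f"Implied U.S. total, 2022: {implied_total_2022_twh:.0f} TWh")
print(f"Implied U.S. total, 2030: {implied_total_2030_twh:.0f} TWh")
print(f"Growth factor: {growth_factor:.1f}x")
print(f"Implied usage per home: {kwh_per_home:.0f} kWh/year")
```

Both percentage figures imply roughly the same total U.S. consumption, so the projection appears to assume a flat national total while the data center share triples.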
“AI has the potential to exceed all the transformative innovations created in the past century,” said Imran Yusuf, director of hardware ecosystems in Arm’s Infrastructure Group. “But the power demands of AI compute are significant, and as companies rush to build out their AI capabilities, they risk outpacing their own sustainability targets. Future AI models will continue to become larger and smarter, fueling the need for more compute, which increases demand for power as part of a virtuous cycle. Simply put, no electricity, no AI. How do we balance the need for electricity with the need to continue fueling the AI revolution? By finding ways to reduce the power requirements for large data centers and AI compute systems.”
Others agree. “It comes down to being more efficient in the design,” said Neil Hand, director of marketing at Siemens EDA. “You can look at any algorithm. There are orders of efficiency. The most inefficient approach from a power perspective is to run the raw code on a general-purpose processor and let it crank away — and toast some marshmallows above the CPU because it’s going to get very warm.”
A rack of servers based on NVIDIA Grace Blackwell GPUs requires 120 kilowatts. But they also are 1,000 times more capable than previous generations, which equates to 500 times more computation per unit of power, according to Rich Goldman, director at Ansys. “If you build a data center of CPUs, and want to do a lot of computing, you can replace the CPUs with the latest GPUs and bring the power way down for the same computing capabilities. We are still going to have huge power issues, because with AI we’re doing more computing, but that will help.”
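The relationship between those figures can be worked through in a few lines. This is a rough sketch using only the numbers attributed to Goldman above. The assumption of continuous full-load operation for the annual energy figure is ours and is not stated in the article.

```python
# Rough arithmetic on the rack-level figures quoted above.
# Inputs are the article's numbers; the full-load assumption is hypothetical.

HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years

rack_power_kw = 120.0     # power required by one rack
capability_gain = 1000.0  # "1,000 times more capable"
efficiency_gain = 500.0   # "500 times more computation per unit of power"

# If capability rose 1,000x while compute per watt rose 500x,
# the power drawn must have risen by the ratio of the two.
implied_power_ratio = capability_gain / efficiency_gain

# Energy needed to do the SAME amount of work on the newer hardware,
# as a fraction of what the older hardware would have used.
energy_fraction_same_work = 1.0 / efficiency_gain

# Annual energy for one rack, ASSUMING it runs at full load all year.
annual_rack_energy_kwh = rack_power_kw * HOURS_PER_YEAR

print(f"Implied increase in power draw: {implied_power_ratio:.1f}x")
print(f"Energy for the same work: {energy_fraction_same_work:.1%} of the old figure")
print(f"Annual energy per rack at full load: {annual_rack_energy_kwh:,.0f} kWh")
```

This reflects Goldman's point: the energy per computation drops sharply, but total power still climbs when far more computing is done.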
Hand sees the specialized Grace Blackwell-type of purpose-built hardware for AI as the next step in development, but he adds that much more can be done because these are still general-purpose chips. He said it is possible to build more efficient AI chips that are meant to classify and work with a broad classification.
“This is what we see NVIDIA doing, this is what we see Google and a lot of the cloud providers doing,” Hand noted. “There are levels of efficiency that we can do to enable those chips to be produced more efficiently, and to enable everything around sustainability that goes along with that. Then, you can go one step further with new technologies that take a fully trained AI model and convert it into hardware. That’s going to be the most efficient you can be, and it’s going to use a fraction of the power. You’re locked in at that point, so then you have to face the challenge that as AI is accelerating and changing so rapidly. How much can you lock down? We’re seeing this with the recent announcements from Intel and Qualcomm on their mobile CPUs, and Apple’s been doing this for a couple of generations. Now you have the TPUs or the NPUs or whatever neural network you want to have on the chip, and there’s just going to be that continuum.”
Steve Roddy, chief marketing officer at Quadric, agreed. “The forces driving the explosion of AI/ML models are massive in scope. The generative AI wave is changing the way individuals and corporations work, possibly impacting millions of jobs around the world. The semiconductor/EDA/IP ecosystem cannot reasonably bend the curve of progress in the evolution of massive AI models that is being propelled by hundreds of billions of dollars of investment from tech titans annually. But the semiconductor and IP industries can help show the way to radically reducing the energy consumption footprint of running those GenAI models in day-to-day inference use.”
The biggest sustainability challenge for GenAI inference is the conversion of models into lower-power, integer-quantized formats. “Data science teams creating GenAI models are seemingly oblivious to the fact that the reference models they publish in floating point format consume 10X the energy per inference compared to the same model converted into integer formats (comparing a 32b floating point multiply-accumulate versus an 8 x 8 or 4 x 8 integer MAC),” Roddy said. “Whether inference happens in the data center or on device, the energy savings – and resulting benefits to sustainability – from conversion to lower-power integer formats are substantial. This gap exists today because the data scientists creating new GenAI models are mathematicians, not embedded engineers.”
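To make the conversion Roddy describes more concrete, here is a minimal sketch of symmetric per-tensor 8-bit quantization in plain Python. It is one common scheme, chosen here for illustration, and not necessarily what Quadric or any particular toolchain uses. The function names are hypothetical. The code shows only the numerical conversion and integer multiply-accumulate; it does not measure energy, and the 10X figure remains Roddy's claim.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the integer representation."""
    return [q * scale for q in quantized]


def int_dot(qa: List[int], qb: List[int], scale_a: float, scale_b: float) -> float:
    """Dot product using integer multiply-accumulates, rescaled once at the end."""
    acc = sum(x * y for x, y in zip(qa, qb))  # integer MACs only
    return acc * scale_a * scale_b


# Illustrative (made-up) weights and activations.
weights = [0.5, -1.0, 0.25, 0.75]
activations = [1.0, 2.0, -0.5, 0.0]

q_w, s_w = quantize_int8(weights)
q_a, s_a = quantize_int8(activations)

float_result = sum(w * a for w, a in zip(weights, activations))
int_result = int_dot(q_w, q_a, s_w, s_a)

print("Quantized weights:", q_w)
print("Float dot product:", float_result)
print("Int8 dot product (rescaled):", int_result)
```

The integer result differs slightly from the floating point result. Managing that accuracy loss across a whole model is the part that tends to need tooling and embedded expertise.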
At the same time, effectively incorporating the high-density compute that AI demands, in a sustainable fashion, will come down to how well we use what we already have at our disposal, noted Mark Fenton, senior product engineering manager at Cadence. “Not everyone is going to be fortunate enough to get a new data center build approved. If we turn our attention toward tackling the issue of stranded capacity—capacity that is available but going unused—we can use the power we do have more effectively. As we see more and more AI applications deployed in data centers, we’re going to see the problems that come with accommodating mixed-density systems. Tapping into stranded capacity, especially with mixed-density systems, is easier said than done when you don’t have performance visibility. Emerging technology, like CFD-powered digital twins, can provide that performance insight into how much capacity you have at your disposal and exactly how efficiently you can run without compromising resilience.”
Technology advances
When the design team decides it’s time to go to hardware, even though AI algorithms are advancing so quickly, there will be a certain amount of stability, Hand said. “It’s a bit like when we went into the evolution of the TPUs, NPUs, XPUs. Generally, they weren’t very viable for a while because the core building blocks were changing more regularly. But over the last few years, those core functionalities have stabilized so that now you can make more energy-efficient hardware. It’s going to be the same here as things start to stabilize. There will always be new innovations. There will always be much more powerful algorithms that need new hardware, and we’ll probably always end up using more than we say because the data centers are a huge energy consumer, but what do they offset? That’s the whole question from a productivity perspective. Are they allowing us to do things we wouldn’t otherwise be able to do? But a lot of that is just whether it’s a complex AI chip or a complex any sort of chip. The EDA aspects are roughly the same. We improve yield, improve performance, reduce power.”
Another consideration is how to generate enough power for the data center. How do you put these data centers in places where there isn’t enough power? “You’ll see data centers more often in places that have a surplus of power, and countries that have a surplus of power will become data center hubs,” Goldman said. “They will locate the data center there, then output the knowledge that you get from that power.”
That will be accompanied by new and novel sources of power. “We’ve heard talk of mini nuclear power plants to power individual data centers, and applications like putting them on top of oil wells so the methane that’s now going into the air and polluting the air, we can burn that and create power out of those to run the data centers,” Goldman said. “We have to get novel like that in order to power these data centers.”
Part and parcel of the AI sustainability discussion is data center efficiency, where there is an ongoing conflict between the power needed to move data and how far that data actually needs to move. “We see people building very big processors out of chiplets, and part of that is because while they know they could build an equivalent system out of two chips that are connected over a bus, moving the data between them is going to cost a lot of power,” explained Steven Woo, fellow and distinguished inventor at Rambus. “The more you bring things together, the less distance the data has to travel, the less power you spend moving data. But then the problem is that the power density goes up. You have to deliver more power, and you’ve got to cool the thing. So, the tug of war is, how do I cool this thing? I’m saving some power by not having the chips be distant from each other, but now it’s got to deliver more power into that volume, and you’ve got to cool it. That’s the tug of war that system-level architects face now.”
At the data center level, there is increasing use of advanced cooling techniques in servers. “More people are talking about liquid cooling, including cold plates,” Woo said. “Google has been doing this, and NVIDIA has been talking about it for Grace Blackwell. That’s the technology that some of the supercomputers use today, and it’s not a foreign or a new technology. It’s just a matter of time before we see it as pervasive. There will probably be some systems that are air cooled for the foreseeable future, but you’ll see more systems adopting some kind of pumped liquid with a two-phase cooling type system.”
At the extreme end of this approach is immersion cooling. “This may not be a viable data center server cooling solution any time soon, but it might be if we run out of runway on the piped liquids,” he added. “Once you’re able to manage the heat with a liquid, or something like that, then you can confine the temperature range that a chip goes through, which means if it does thermally cycle, it may not hit the extreme temperatures. That’s good, because then you don’t have the thermal expansion and contraction over this bigger range, and some of the other reliability related failures get eased a bit.”
Big data analytics also plays a role in sustainability by identifying trends in chip design based on data center usage. “There’s a whole train of power,” said Adam Cron, distinguished architect at Synopsys. “The concerns are at the die level. Then you have the board it’s on, the rack it’s in, the farm it’s in, and then it goes all the way back to the transmission lines. And what data center managers like to see is a nice even keel. Don’t rock the boat too much, because if you want a lot over here, then you may have to drop down over there, or vice versa.”
Ideally, power would be reduced everywhere. “Data centers are using between 1 and 2% of the world’s electricity, and they’re on a crash course to use it all,” Cron said.
Yield, manufacturability links
There is an interesting link between the effort to make sustainable semiconductors and yield and manufacturability, Siemens’ Hand noted. “These chips are getting bigger. Therefore, yield becomes more of a problem. The more we can do with yield, the better we can be for sustainability, because a dead die is a waste of resources. A good die is always a better thing. What can we do there? There’s a lot of work that we do in the EDA-semiconductor ecosystem to improve yield. Ironically, some of it uses AI. Certain commercial fab solutions use AI to get better yield, and it’s a fascinating interplay between the sustainability cost of AI and the sustainability opportunity of AI and how they work together. Generally speaking, it’s a net positive. But how do you measure it? I don’t know that you can, because it’s hard to find the raw data that says what it costs to train these models. It’s expensive to use them. But how do you then measure the insights? That’s probably a whole PhD’s worth of work to try and calculate that impact. I believe there is an opportunity to refine automation more, and AI technologies can be leveraged here. It’s just EDA doing its EDA thing.”
Thermal analysis and aging also will play into AI data center sustainability. “This is not unique to AI, of course. It applies to all chips,” Hand said. “You can start to understand what are the thermals at the die level, at the sub-die level, to understand the implications. Now, can I put a 3D stack together and have it be reliable? Can I run these things at a sufficient power level that I’m not accelerating their demise? These capabilities will allow us to go even further. Also, as we start expanding the definition of the digital twin and start to expand what’s included in that, you can start to get to see the system-level impacts. And while not specific to AI and sustainability, it is where we’ll start seeing big jumps in efficiency because how much is over-engineered at the moment? How much has too much waste inside it? Once we can start to have a more consistent product-level and problem-level design, there are benefits there, as well. Bringing it back around to AI specifically, to get to true multi-domain comprehensive digital twins, AI will be needed because AI is going to be doing the cross-domain analysis for you and the surrogate model extraction and the automatic fidelity adaption. There will be all of those. The more AI we need, the more AI we use.”
Conclusion
Sustainability is not a new topic, but it was always a topic for some future discussion. That’s no longer the case.
“Hardware and software vendors have understood for years that compute scaling and sustainability could be on a collision course with reality, but they don’t have to be if the industry leverages its ecosystem capabilities,” said Arm’s Yusuf. “Take Arm Total Design, for example, which is an ecosystem of leading companies from across the semiconductor industry dedicated to enabling efficient, custom silicon solutions for AI/ML use cases. As such, it provides partners with preferential access to Arm Neoverse Compute Subsystems (CSS), pre-integrated IP and EDA tools, design services, foundry support, and commercial software support. It also represents a new mindset that fosters collaboration, flexibility, and innovation, while understanding that the age of AI needs to co-exist with global sustainability objectives.”
What’s changed is that sustainability has moved from “nice to have” to “must have.” From its perspective as an industry leader, Intel has created Seven Tips for using AI without the high environmental cost. Intel also believes that being intentional about how AI is designed and implemented is key to achieving sustainability goals. With that intentionality, those who execute AI initiatives can reap the benefits of optimized workloads, and a proactive approach to project design and IT management is critical to maximize the impact of AI initiatives and minimize their carbon footprint.
Fundamentally, lessening AI’s impact on sustainability will come down to the technology itself. Quadric’s Roddy said the semiconductor ecosystem can help by “increasing general awareness of the necessity of using lower-precision data formats, as well as building more automated tooling to bridge the gap between mathematician and embedded engineer.”