Low Power Still Leads, But Energy Emerges As Future Focus

More data, more processors, and reduced scaling benefits force chipmakers to innovate.


In 2021 and beyond, chips used in smartphones, digital appliances, and nearly all major applications will need to go on a diet.

As the amount of data being generated continues to swell, more processors are being added everywhere to sift through that data to determine what’s useful, what isn’t, and how to distribute it. All of that uses power, and not all of it is being done as efficiently as it can be. The result is that chipmakers are now beginning to focus on how to slim down these devices and make them more energy-efficient, a mammoth undertaking that will consume much of the semiconductor industry for the foreseeable future. Put simply, power is becoming the most important design metric as more and more chips are developed — and improving energy efficiency is the biggest knob to turn.

“This includes chips we wouldn’t have thought of in the past, such as those used in automotive and spanning from infotainment, digital cockpit and sensor fusion to autonomous driving,” said Tom Wong, director of marketing for design IP at Cadence. “This ‘diet’ can be accomplished by integration to consolidate the number of chips, or by moving to the next more advanced process geometry and getting whatever benefits you can through More than Moore.”

The focus on power is particularly important for increasingly autonomous vehicles, which must manage many more sensors with more intensive computational needs, often within a fixed energy budget that determines how far a car can drive.

“Many of the systems that used to be mechanical are being electrified, and a lot of electronic computation is occurring all the time in systems that are always on demand,” Wong said. “The digital cockpit requires many integrated chips, all needing lower power to manage higher display resolution and more electrification of systems such as the e-mirror, in-cabin driver monitoring, and ‘sentry-mode’-like security. An e-mirror consumes more power than you might think compared to a traditional mirror. And OTA software updates require a wireless or cellular connection, which also consumes power. These systems are becoming so ubiquitous that they need to be low power. Especially for electric vehicles, conserving power in any auxiliary system is paramount, because you have a fixed battery capacity whose main function is to provide energy to drive the car. You can’t have the AC, infotainment, e-mirror, etc., consume a lot of power, since it’s a fixed commodity, so low power is crucial. You can increase energy capacity by adding more battery, but at roughly $100 per kWh, the costs can escalate very quickly.”
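The battery trade-off above can be sketched as simple arithmetic. The ~$100/kWh pack cost comes from the article; the auxiliary load and trip length below are illustrative assumptions, not figures from the article.

```python
# Back-of-the-envelope cost of carrying extra battery capacity to feed
# always-on auxiliary electronics (e-mirror, infotainment, etc.).

PACK_COST_PER_KWH = 100.0  # $/kWh, rough figure cited in the article


def extra_pack_cost(aux_power_w: float, trip_hours: float) -> float:
    """Cost of the extra battery capacity an always-on auxiliary load
    consumes over one trip, at the assumed pack cost."""
    extra_kwh = aux_power_w / 1000.0 * trip_hours
    return extra_kwh * PACK_COST_PER_KWH


# An assumed 200 W of always-on auxiliary electronics over a 5-hour drive
# consumes 1 kWh, i.e. $100 of pack capacity per trip-equivalent:
print(f"extra capacity cost: ${extra_pack_cost(200.0, 5.0):.2f}")
```

Even a modest continuous load adds up quickly, which is why every auxiliary system is pushed toward low power.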

Across the board, power is now edging out performance as the most critical design metric. “People want to design IPs and SoCs that are energy-efficient, and they want the tools to take power as the primary constraint, as the primary metric, and then do the analysis,” said Qazi Ahmed, principal product manager for PowerPro at Mentor, a Siemens Business. “It could be a synthesis tool. It could even be a downstream tool, like a P&R tool. They want tools to do that.”

Rob Knoth, product management director in the Digital & Signoff Group at Cadence, agreed that power is becoming more critical across all products at an accelerating rate. “Power is clearly above all. It doesn’t matter what customer I’m talking with or what the end application is. Power is increasingly the specification used to define a product, or that somehow limits it. It’s a great thing, because it unifies so much of what we’ve all been working on in the whole semiconductor ecosystem, bringing together functional verification with implementation, sign-off, optimization algorithms, and now machine learning, to take it another mile.”

Energy vs. power
Alongside the focus on power is a growing interest in energy. The ultimate aim is energy-efficient designs, where more can be done with less energy. “Engineering teams want to look at designs not just from the perspective of power, but from the dimension of energy itself,” Mentor’s Ahmed said. “They want to understand where the energy is being consumed, and how efficient that consumption is.”

There is a method to this. “First, if you just look at the power, let’s say the designer has power analysis, and the IPs dissipate a certain amount of power,” he said. “There’s peak power, then there’s average power. You can plot the power and see that it behaves differently in different performance and functional modes. The key question, if you want to achieve energy efficiency in the end, is where do you focus in the design? Power by itself may not be enough to lead you to the problems, because some workloads or scenarios are more dominant than others. An IP or mode that appears to consume less power may be active far more of the time, so the total energy consumed by that particular IP, part of the chip, or scenario ends up being dominant. That probably would be the place to focus.”
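The distinction Ahmed draws can be made concrete: energy is power integrated over time, so a low-power block that is active most of the time can dominate total energy. A minimal sketch, with illustrative IP names, power figures, and activity times that are assumptions rather than data from the article:

```python
# Energy = power * time, so the right optimization target is found by
# weighting each block's average power by how long it actually runs.

# (average power in mW, active time per workload in ms)
scenarios = {
    "video_codec":   (120.0,  40.0),   # high power, but short bursts
    "always_on_dsp": ( 15.0, 900.0),   # low power, almost always active
}

# mW * ms = microjoules
energy_uj = {name: p_mw * t_ms for name, (p_mw, t_ms) in scenarios.items()}

dominant = max(energy_uj, key=energy_uj.get)
for name, e in sorted(energy_uj.items()):
    print(f"{name}: {e:.0f} uJ")
print("focus optimization on:", dominant)
```

Here the codec draws 8x the power, yet the always-on DSP consumes nearly 3x the energy, so it is the better optimization target.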

Knoth said energy is a new dimension to explore, and he expects activity in this area to grow. “It isn’t going to be for everyone, but for the people who are able to pivot to that, it’s tremendously powerful. We’ve spent so much time on the key optimizations, trying to do power reclamation before sign-off. But there, you’re struggling for 1% to 5% power gains. The people who are really working on energy are cutting their effective power in half. It completely changes the game. For teams designing IP, this is very important. But where it’s most impactful is for companies that are vertically oriented and own the whole stack, from the firmware up. They’re able to optimize the whole system, because they own the whole thing. The bigger your solution space is, the bigger the set of variables you have to work with, and the bigger the end impact you can have. It’s not going to happen for everyone. But the people who do operate at that level will be able to make some pretty transformational gains.”

This is good for tools companies, too. “There are very strong correlations between those approaches and certain types of design automation tools,” Knoth said. “There’s a much broader adoption of hardware-based emulation, because in order for you to work on power and energy, you need to have incredible functional stimulus, and there’s nothing better than the actual software that you’re going to be running on a device.”

Still, while top-tier customers are looking at possible improvements in energy efficiency, that doesn’t mean solutions are in place. “There are developments happening with multiple EDA vendors to try to do something about it,” said Ahmed. “Existing tools do not have the capability to report energy metrics, such as energy efficiency, energy per toggle, and energy per cycle, which may be important for such analysis. Also, when you talk about energy, you’re talking about the work done over time. So if you are looking at utilization versus performance, you want to understand, if something consumes X amount of power at 100% utilization, is the power cut in half at 50% utilization, or maybe more? That’s one way to look at it. EDA tools right now do not report energy-related metrics. They do not have energy reports. Another important factor in looking at power and energy together is being able to characterize the designs for a large number of workloads. Because you might be optimizing power in a mode that is not dominant, you want to characterize the design for hundreds of workloads, and typically at large scale. It might be an ultra-large SoC, so the challenge is one of scale, as well. EDA tools are not prepared to handle hundreds of workloads for ultra-large SoCs. Emulation certainly helps. But there’s still a long way to go in qualifying the right workload for power, then building it from the standpoint of both energy and power, and having capabilities to analyze that through overlays, graphs, and other kinds of analyses.”

AI everywhere
Much of semiconductor design today involves designing, using, and applying artificial intelligence/machine learning/deep learning in many different applications. Many of these applications are extremely power-sensitive.

“Artificial intelligence and machine learning gain ground when their complexity gets pushed into the background,” said Jem Davies, vice president, Arm fellow and general manager of Arm’s Machine Learning Group. “Over 1.5 billion people enjoy ML algorithms when they take smartphone pictures, or subsequently search for them in their ever-expanding photo files, generally without knowing it. The same phenomenon occurs whenever someone issues a command to one of the estimated 353 million smart speakers deployed worldwide. Invisibility works.”

Davies expects to see that invisibility spur the adoption of many applications. “One-click smart parking will likely be the first experience with autonomous cars for many. Security systems that can accurately differentiate between the sound of a nearby prowler and a wandering raccoon will attract consumers. Invisibility, however, remains hard work. Improvements in CPUs, NPUs and GPUs will be required. AI processing also will have to shift to devices to save energy and cost, putting an emphasis on creative, elegant algorithms that minimize everything — storage, bandwidth, compute, and power. We also will have to give consumers a much-needed sense of privacy and data autonomy. If we don’t give individuals a better way to control how AI impacts their lives, it could become the biggest roadblock of all.”

Pruning designs
Engineering teams are taking different approaches, many of which point toward more targeted and flexible design strategies. “One of the paths is the more traditional one of getting to the next process node, and how to continue reducing the power, both with the newer process nodes as well as kind of smarter architectures,” said Steven Woo, fellow and distinguished inventor at Rambus. “The poster child for all of this has been AI, where many people are designing domain-specific chips. These are purpose-built, which means you can get rid of all the stuff you don’t really need because it’s not a general-purpose engine, and you can more highly optimize.”

For high-performance compute and networking, that may not be sufficient. The cost of powering and cooling data centers is already high, and with more intensive computation with increasing amounts of data, the trend is toward even higher energy consumption.

“There is a lot of pressure to get the power down these days,” said Manmeet Walia, senior product manager for mixed-signal PHY IP at Synopsys. “The reason is that the amount of data passing through these networks continues to increase. As data rates increase, serial speeds also will need to increase. As a result, there’s a lot of pressure to get the power down in terms of picojoules per bit. All of a sudden we are moving from 25 Gb/s to 50 Gb/s SerDes, and then from 50 to 100 Gb/s. Even though they occupy the same amount of space on the chip, they now need to transmit twice as many bits.”
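The picojoule-per-bit metric Walia mentions falls out of a unit identity: milliwatts divided by gigabits per second is exactly picojoules per bit. A minimal sketch, where the 250 mW lane power is an illustrative assumption, not a vendor specification:

```python
# Energy efficiency of a serial link: mW / (Gb/s) == pJ/bit, because
# 1e-3 W divided by 1e9 b/s is 1e-12 J per bit.

def pj_per_bit(power_mw: float, rate_gbps: float) -> float:
    """Energy per transmitted bit, in picojoules."""
    return power_mw / rate_gbps


# An assumed 250 mW SerDes lane at successive line rates: doubling the
# rate in the same power envelope halves the energy per bit.
for rate in (25.0, 50.0, 100.0):
    print(f"{rate:5.0f} Gb/s -> {pj_per_bit(250.0, rate):.1f} pJ/bit")
```

This is why the industry tracks pJ/bit rather than raw watts: the same lane power looks twice as efficient once it carries twice the bits.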

Many of these chips also are at reticle size, which poses other problems. “Even if you can fit within the max reticle, your yields are low,” said Walia. “If you have the smallest error, the whole die is wasted. A lot of our customers are building really large dies. They end up getting only 10 or 15 chips per wafer because the yields are very low. What these companies are now doing is taking the chiplet path. With chiplets, they can break the large die into smaller dies, which solves the area problem, and it’s still easy to package it all together and dissipate the heat through that packaging.”

The increase in data also is one of the drivers behind memory-centric compute, which includes everything from integrating CPUs and GPUs directly into memory devices to high-speed interconnects inside 3D chiplets.

Rob Aitken, an Arm fellow, noted that moving data takes as much or more energy than computing. “If you can keep the data where it is and move the compute to the memory, the energy cost goes down,” he said. “Some have shown that 62.7% of total system energy gets spent moving data between main memory and compute. Other critical benefits flow from this shift in thinking: reducing the relative need for internal bandwidth, getting around the problem of limited ‘beachfront’ real estate on chip edges for connections, and being able to redirect energy ordinarily consumed in transport to other purposes. The ideas range from adding processors to memory cards to building customized memory instances with compute capability. We don’t know which will succeed, but as a concept, memory-centric computing is inevitable.”

Tightly integrating hardware and software has a big payoff, as well. “With a lot of the companies designing their own chips, the ones that are seeing a lot of success are worried not only about the semiconductor design, but they’re also worried about what that means for the software,” said Rambus’ Woo. “They’re trying to take the inputs from the software as far as what’s needed to make a good application. They’re also trying to say, ‘Here’s some tools that, maybe if you write your code just a little bit differently, you can take advantage of this.’ That has a tremendous, almost disproportionate power savings, compared to what you might do traditionally. Co-design is really becoming important.”

It goes well beyond software, too. “Co-design now includes the package, which means consideration must be given on the hardware side not only to the fundamental circuits, but all the way up through how the device is packaged and where the components are placed,” Woo said. “Especially in AI, they’re no longer in the data center. They’re not really single-chip solutions anymore. Many times, multiple components are strung together, and it starts to get to be this really interesting question of, ‘Well, how many of these things can I put on a board? If I can put more of them on a board, it’s less communication back and forth.’ The traditional thinking might be, ‘If I make really big engines, I can do a lot more work without going off the chip.’ But that becomes hard to cool and hard to package, and maybe the better choice is somewhere in the middle, where it’s not really tiny chips, but very good chips that have the right computation-to-communication ratio built in. The communication is the thing that’s killing you, so they’re really trying to figure out how much compute can be packed into these chips, and what the right size is to keep the communication manageable.”

Adding to the complexity, not all data is the same. The volume of data is increasing, but it’s also getting more complex. So instead of just passing along some video content, now it’s training, genomics, neural networks, and video transcoding.

“This is consuming the CPU cycles,” Walia said. “The CPUs are running at 100% of their capacity, and this means you need more and more chips. You need more and more accelerator chips around the CPU, and these specialized accelerators can handle this complex data better. This is chips talking to each other, systems talking to each other, machines talking to each other, and all at very fast speeds. Think about your server box. Previously, the server box just had the CPU. Now it has the CPU, and it has all these accelerators to help manage this complex data. All this is leading to more and more chips within a box, and more and more power within the box. The traffic, from what I’ve heard and seen, is 80% machine-to-machine traffic. Only 20% of it gets out of the machine to the users.”

What makes the data complex is that it is more than just requesting a 4K video. “It needs to read your behavior, it needs to train through voice recognition, etc.,” Walia said. “So there’s a lot of machine-to-machine traffic, which requires a lot of parallel compute. It needs performance for this specialized, complex data, but also low latency. For example, as you’re scrolling through the Netflix site, it has to tell you really fast what’s the next movie you’ll want to watch based on your previous activity. By upgrading the SerDes, the SerDes has to work a lot less hard, because it has to talk to the optics, which is sitting right next to it. It can be co-packaged optics or onboard optics with the SerDes, and it’s going to consume a lot less power now.”

Performance still matters
Looking into 2021, it’s the same mantra — bigger, better, faster.

“Mainstream and gaming CPUs are going above the 3GHz threshold, and will likely go even higher in 2021,” Wong said. “Typically, companies resort to multi-core to get the same performance and have kept the clock frequency below 3GHz, unless it’s a server application, given the greater performance required and tolerance for higher power consumption. Heterogeneous SoCs will continue to ramp up performance while maintaining low power. Hyperscale computing will drive faster switches and networks, from 50G to 100G to even 400G. 112G-LR single-lane SerDes are expected to become mainstream in 2021, with an eye toward 224G and beyond in 2022. Look for accelerated development of these faster SerDes in 2021. This will also drive early adoption of sub-5nm silicon. Advanced memories such as GDDR6 and HBM3 will be needed. Chiplet and die-to-die connectivity is an essential technology to enable this direction for high-performance computing.”

Geometry and architecture also will play a bigger role in decisions about power. “We’re now looking at 3nm being used to tape out some designs, and as we go down to that sort of geometry, the RC characteristics are much more pronounced,” said Ahmed. “For example, resistance is now a bigger factor than it used to be. Earlier, capacitance dominated everything. When resistance dominates, you need a different kind of analysis. And then you have two competing parameters, performance and power. They are opposites. When it comes to the geometry, you need bigger cells for performance, but you are going to a smaller technology node, so how do you solve that problem?”

Much of the power-versus-performance challenge is being handled by technologies like adaptive voltage scaling and dynamic voltage and frequency scaling (DVFS). “In fact, engineering teams are going toward near-threshold computing (NTC), where we try to lower the voltage and see if we can still achieve the required performance at a very low voltage without causing too many errors,” he said. “Error correction can handle a certain number of errors, say 1,000 in 10 billion. If that error rate is acceptable, you can lower the voltage until you don’t exceed it. Near-threshold computing is becoming real as a way to save power, because voltage is a squared factor in the power equation.”
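The “squared factor” Ahmed cites is the standard dynamic power relation for CMOS, P ≈ α·C·V²·f. A minimal sketch of why voltage is the biggest knob; the activity factor, capacitance, frequency, and voltages below are illustrative assumptions:

```python
# Dynamic (switching) power of a CMOS node: activity factor * switched
# capacitance * supply voltage squared * clock frequency.

def dynamic_power(alpha: float, c_farads: float, v: float, f_hz: float) -> float:
    """P = alpha * C * V^2 * f, in watts."""
    return alpha * c_farads * v * v * f_hz


# Assumed figures: 10% activity, 1 nF effective switched capacitance, 1 GHz.
nominal = dynamic_power(0.1, 1e-9, 0.9, 1e9)   # 0.9 V nominal supply
scaled  = dynamic_power(0.1, 1e-9, 0.6, 1e9)   # DVFS down to 0.6 V

print(f"power at 0.9 V: {nominal * 1e3:.1f} mW")
print(f"power at 0.6 V: {scaled * 1e3:.1f} mW")
print(f"savings from voltage alone: {100 * (1 - scaled / nominal):.0f}%")
```

Because V enters squared, dropping the supply from 0.9 V to 0.6 V cuts switching power by more than half even at the same frequency, which is the payoff near-threshold designs chase (at the cost of slower, more error-prone logic).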

While performance has been the primary goal in semiconductor design from the beginning, it’s no longer the dominant concern. Plenty of computing horsepower is available with today’s CPUs, GPUs, and NPUs. But there also is far too much power being used, and that is creating problems.

“We can achieve the performance that we need in a GPU, for example, which is much more efficient in comparison to a CPU,” said Ahmed. “For this reason, a lot of compute engines are moving to GPUs now. Performance is something that designers are able to achieve even though the constraints are different. Now, it’s not the timing. It’s not how to scale up the performance. It’s how to maintain the performance while you contain the power, and that’s the challenge.”


