As complexity continues to rise, the industry must determine how to maintain sensitivity to power, cost and performance in the CPU architecture.
By Ann Steffora Mutschler
SoC and system design is already complicated, but as complexity continues to rise, the industry must determine how to maintain sensitivity to power, cost and performance in the CPU architecture.
Where does this stand today—not just with architectures and microarchitectures for consumer electronics, but for all other kinds of applications? What kinds of changes will be made in the architecture of the CPU itself? How will the industry advance CPU microarchitecture performance and power for the future?
“We see Moore’s Law slowing down—this is the best case right now in terms of getting the benefits of continued process migration to smaller and smaller geometries. It’s not that there’s no gain, but the gains are definitely diminishing from that, at least in terms of directly delivering performance. So people are getting more creative in that regard,” noted Mark Throndson, director of product marketing at MIPS.
He also pointed out in a recent blog entry, “…adjusting configuration options on an existing processor core only takes us so far, and serves more as a preview to the many steps that can and should be taken to advance CPU microarchitecture performance for future products. If process technology migration won’t deliver significantly higher frequencies in the future, and software doesn’t scale beyond two to four core systems for CE applications, then each processor must be designed to do more work/MHz.”
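Throndson's work/MHz framing reduces to simple arithmetic: if the clock can no longer be pushed higher, performance gains have to come from more work per cycle. A minimal back-of-the-envelope sketch in Python, with all numbers invented purely for illustration:

```python
# Back-of-the-envelope only; all numbers are hypothetical. If the clock is
# fixed, a performance target translates directly into a work-per-cycle (IPC)
# requirement for the microarchitecture.
freq_mhz = 1500.0        # assumed fixed clock frequency, MHz
baseline_ipc = 1.2       # assumed instructions per cycle of the current core
target_speedup = 1.5     # desired gain without a frequency bump

baseline_mips = freq_mhz * baseline_ipc          # millions of instructions/s
required_ipc = baseline_ipc * target_speedup     # IPC needed at the same clock

print(f"baseline: {baseline_mips:.0f} MIPS at {freq_mhz:.0f} MHz")
print(f"IPC needed for a {target_speedup}x gain at the same clock: {required_ipc:.2f}")
```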
On the software side, Throndson said it may be a bit strong to say that software doesn’t scale. “It’s more that software definitely lags the capabilities of hardware today. Realistically when you look at the typical consumer product, it was actually a pretty significant jump to start building multicore systems running one common operating system and trying to use multiple cores to solve the application processor requirements within these products. I’m making that distinction because we have multiple customers who put multiple cores in a chip for the last 10 or 20 years. But cores have been more loosely coupled—there’s a core in there to do audio processing, there’s a core in there to do security, there’s a core in there to do I/O, there’s a core in there to run the primary OS and applications—and all these cores can talk to each other. They weren’t tightly coupled using one system. That’s what has changed more recently. You see in phones and TVs today tightly coupled multicore solutions for the application processor.”
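The distinction Throndson draws can be pictured in software terms: loosely coupled cores look like independent programs that only exchange messages, while a tightly coupled application processor looks like one OS scheduling threads over shared memory. The Python sketch below is a rough analogy only, not any vendor's actual software stack:

```python
# Rough analogy only. Loosely coupled: independent workers (audio, security,
# I/O) that exchange messages and share nothing, like dedicated cores running
# private firmware. Tightly coupled: threads sharing one address space under
# a single scheduler, like an SMP application processor running one OS.
from multiprocessing import Process, Queue
from threading import Thread

def audio_worker(inbox: Queue) -> None:
    msg = inbox.get()                       # message passing, no shared state
    print(f"'audio core' handled: {msg}")

def smp_task(task_id: int, shared: list) -> None:
    shared.append(task_id)                  # shared memory, one address space

if __name__ == "__main__":
    # Loosely coupled style: a separate process with its own memory.
    q = Queue()
    p = Process(target=audio_worker, args=(q,))
    p.start()
    q.put("decode frame")
    p.join()

    # Tightly coupled (SMP) style: threads scheduled over shared memory.
    results = []
    threads = [Thread(target=smp_task, args=(i, results)) for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(f"SMP threads completed: {sorted(results)}")
```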
As for maintaining sensitivity to power, cost and performance in all types of applications, Chris Rowen, chief technology officer at Tensilica, said there is no simple answer. One approach, he said, is to look at the fundamental changes in the universe of computing that are putting pressure on the evolution of CPU architectures and microarchitectures, because those architectures are just tools for solving problems. As the problem changes, the solutions change.
“Traditionally we’ve looked at Moore’s Law as the outside force for the opportunity which enables you to do things differently in microprocessors. Certainly we’ve seen that. They have tracked with Moore’s Law, and lo and behold, as the semiconductor industry delivered us more transistor density we CPU designers went and took advantage of it and threw more transistors at the problem,” Rowen noted.
The mega-change, he said, is that it no longer helps nearly as much to throw more transistors at a problem because while the transistors have gotten cheaper and cheaper at a predictable rate, power has not gotten cheaper—not just in the strictly cost/kilowatt hour sense but in the sense that people can’t afford to run those transistors. “That has some important and pervasive and sometimes subtle impact on how you go about doing it. You don’t just say, ‘I’ll make it bigger,’ except in some applications, such as the high end in the server class. To some extent you can keep throwing transistors at it but not to the degree that you could because even in the cloud, it is more and more of a power constrained environment. People talk very actively about compute/cubic centimeter and compute/watt in a way that isn’t very different from the way they talk about it in the smartphone.”
Further, with mobile computing applications driving the semiconductor industry today, Mike Meyer, a Cadence fellow, has observed a shift toward a variety of solutions: more specialized processors and, in general, more efficient microarchitectures. "You have things like the ARM big.LITTLE, where they've gone to having both a high-performance and a low-power processor with a compatible instruction set, so things can shift easily to either high performance or low power. You've also got places where people are building specialized processors using something like Tensilica and customizing it, or going to the extreme of taking their code and building specialized accelerators, whether it be for graphics or security, cryptography and encryption and decryption, or DSP algorithms that they want to push down into the hardware. There are several different approaches that people are taking there, and that trend is going to continue as we continue to try to push down power, increase battery life and provide more services on these mobile devices."
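As a very loose software analogy for the kind of dispatch Meyer describes (not ARM's or Cadence's actual mechanism), a scheduler might route light work to an energy-efficient path, heavy work to a fast path, and specialized work to a dedicated accelerator. The function names and thresholds below are hypothetical:

```python
# Toy dispatcher, purely illustrative: route each task to a hypothetical
# "little" core, "big" core or fixed-function accelerator based on its
# profile. Real big.LITTLE migration is handled by the OS and firmware.
def run_on_little(task): return f"little core: {task['name']}"
def run_on_big(task):    return f"big core: {task['name']}"
def run_on_accel(task):  return f"crypto accelerator: {task['name']}"

def dispatch(task):
    if task.get("kind") == "crypto":       # offload to dedicated hardware
        return run_on_accel(task)
    if task.get("load", 0.0) > 0.7:        # heavy, latency-sensitive work
        return run_on_big(task)
    return run_on_little(task)             # default to the low-power path

tasks = [
    {"name": "ui_refresh", "load": 0.2},
    {"name": "video_decode", "load": 0.9},
    {"name": "tls_handshake", "kind": "crypto"},
]
for t in tasks:
    print(dispatch(t))
```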
How is this evolving?
Clearly, new solutions will occur in response to new challenges. "You're not going to be able to continue to just be lazy and fall back on frequency, power and area scaling from process technology, and you're not necessarily going to be able to rely on software catching up with these highly parallel hardware capabilities that we can inherently build today," Throndson said. Beyond simply getting more aggressive about the performance, power and area specs that are acceptable in a mobile product, he suggested the tradeoff holds because the performance and new capabilities of this class of product are more compelling than the fact that the device may have to be charged mid-day.
Another approach is to design a high-performance microarchitecture that delivers the key features and performance capabilities the application requires, a task that demands extensive analysis of the application and feedback from it.
To capture use-case scenarios in an architecture, Meyer noted, “a lot of it comes back to really having to go do some architectural studies and looking at it and understanding where your time is being spent or your power, as the case may be, and looking at how to optimize that. A good percentage of that needs to be done before you actually develop the chip. There’s a part that maybe later you want to look at tuning the software for low power but certainly there’s a part where you want the flexibility to address it in the hardware too.”
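One concrete, if much simpler, version of "understanding where your time is being spent" is ordinary profiling of the target workload; the same idea scales up to full architectural studies. A minimal sketch using Python's standard cProfile module, with an invented stand-in workload:

```python
# Minimal profiling sketch: find the hot spots in a stand-in workload before
# deciding what to optimize or move into hardware. The workload functions
# below are invented purely for illustration.
import cProfile
import pstats

def filter_block(samples):
    # Simple 3-tap smoothing filter over the sample stream.
    return [0.25 * (a + 2 * b + c)
            for a, b, c in zip(samples, samples[1:], samples[2:])]

def checksum(samples):
    acc = 0
    for s in samples:
        acc = (acc + int(s * 1000)) % 65521
    return acc

def workload():
    samples = [(i % 17) / 16.0 for i in range(200_000)]
    return checksum(filter_block(samples))

profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Rank functions by cumulative time to see which ones dominate the run.
pstats.Stats(profiler).sort_stats("cumulative").print_stats(5)
```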
He said many design teams at this point are turning to virtual system prototypes to be able to run the software and the hardware to get an idea of system performance before the chip is built. This technology has matured significantly over the last decade and will continue to play an increasingly important role in architecture development.
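Conceptually, a virtual system prototype is a fast behavioral model of the platform that lets the real software run long before silicon exists. The toy model below only estimates cycle counts from assumed per-operation costs; it is a deliberately simplified illustration, not a stand-in for commercial virtual prototyping tools:

```python
# Toy "virtual prototype": estimate cycles for an instruction trace from
# assumed per-operation costs, so software behavior can be examined before
# the chip exists. All cost and frequency numbers are invented.
CYCLE_COST = {"alu": 1, "load": 4, "store": 4, "branch": 2, "mul": 3}

def estimate_cycles(trace, freq_mhz=1000.0):
    cycles = sum(CYCLE_COST.get(op, 1) for op in trace)
    time_us = cycles / freq_mhz            # MHz == cycles per microsecond
    return cycles, time_us

trace = ["load", "alu", "mul", "alu", "store", "branch"] * 10_000
cycles, time_us = estimate_cycles(trace)
print(f"estimated {cycles} cycles, ~{time_us:.1f} us at 1 GHz")
```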
At the end of the day, there isn't a one-size-fits-all CPU architecture. But as history has shown, there is enough research and development happening in the industry to meet upcoming application demands.