There is no clear formula for what to use where, but there are plenty of opinions about who should make the decision.
Choosing a processor might seem straightforward at first glance, but like many engineering challenges it’s harder than it looks.
When is a CPU better than a GPU, MCU, DSP or other type of processor? And for what design—or part of a design?
For decades, the CPU has been the default choice. “It is deliberately designed to be pretty efficient at all tasks, is straightforward to program, generally has good tools and readily available middleware/libraries,” said ARM Fellow Jem Davies. “Since it is truly a digital world — all problems can ultimately be expressed as code — then the CPU remains a possible choice for all tasks.”
But for some tasks it isn’t the best choice because it’s not always the most efficient. “You would not choose to do modern 3D graphics on a CPU, for example, because it would be too slow or consume too much power. If a CPU can achieve the performance required, the next thing to consider would be the power efficiency. If you’re only going to perform the task rarely, then power consumption won’t be too much of an issue. But if you’re going to watch a movie, for example, then you need a processor designed to decode digital video as efficiently as possible, or your battery will go flat just before the final cliff-hanging scene.”
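The power argument above comes down to simple arithmetic: energy is power multiplied by time, so a task that runs for hours is far more sensitive to processor efficiency than one that runs rarely. The sketch below illustrates this in Python. Every figure in it (battery capacity, movie length, power draw) is a hypothetical value chosen for illustration, not a measurement from any real device.

```python
# Illustrative energy-budget arithmetic. All numbers are assumptions.

BATTERY_WH = 10.0    # assumed battery capacity, in watt-hours
MOVIE_HOURS = 2.0    # assumed length of the movie
CPU_DECODE_W = 6.0   # assumed power draw for software decode on a CPU
HW_DECODE_W = 1.5    # assumed power draw for a dedicated video decoder


def energy_wh(power_watts: float, hours: float) -> float:
    """Energy consumed in watt-hours: power multiplied by time."""
    return power_watts * hours


def survives(battery_wh: float, power_watts: float, hours: float) -> bool:
    """True if the battery holds enough energy to finish the task."""
    return energy_wh(power_watts, hours) <= battery_wh


if __name__ == "__main__":
    for name, watts in [("CPU", CPU_DECODE_W), ("dedicated decoder", HW_DECODE_W)]:
        used = energy_wh(watts, MOVIE_HOURS)
        verdict = "finishes" if survives(BATTERY_WH, watts, MOVIE_HOURS) else "goes flat"
        print(f"{name}: {used:.1f} Wh needed, battery {verdict}")
```

With these assumed figures, the software path needs more energy than the battery holds while the dedicated decoder uses a fraction of it. The same CPU running a task for a few seconds would barely register, which is why rarely performed tasks are less sensitive to the choice.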
Krishna Balachandran, product management director at Cadence, agreed the CPU is the most general-purpose type of processor. “You can pretty much do anything. You can program it to do mathematical computations, you can program it to do repeated kind of processing, you can use it to do graphics processing, as well. You can make it do pretty much anything. It’s also the most complex, and because it is the most general-purpose, it is not optimized for any single thing. By definition, because it is capable of doing a lot, it’s kind of a jack of all trades and master of none.”
Balachandran added that the CPU is excellent at doing tasks in a serial fashion. “It does one thing, it waits for the result, then it does the next thing. CPUs today are pipelined and they try to do more than one thing at a time by resorting to pipelining, and some of the CPU architectures have multiple cores and also do some parallel processing.”
The GPU, on the other hand, is an optimized hardware unit used for efficiently processing any kind of graphics. “They involve functions such as texture mapping or rotating images or doing some kind of operations like shading,” he said. “In the more modern context of smart phones, you have motion compensation, or some kind of video encoding or decoding. Those types of functions are best suited for a GPU, which can accelerate these kinds of tasks. They consist of thousands of small cores, in some cases, and these cores are replicated over and over. They do this task very efficiently and they do it all in parallel.”
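The contrast between serial and parallel processing can be sketched in a few lines of Python. The `shade` function here is a hypothetical per-pixel operation, not taken from any real graphics API. Because each pixel is computed independently, the work can be split into chunks and handed to separate workers without changing the result. This shows only the structure of the problem: Python threads will not deliver GPU-like speedups, and the worker count is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor


def shade(pixel: int, gain: float = 0.5) -> int:
    """Hypothetical per-pixel operation: scale brightness, clamp to 0..255."""
    return max(0, min(255, int(pixel * gain)))


def shade_serial(pixels: list[int]) -> list[int]:
    """CPU-style formulation: one pixel after another, in order."""
    out = []
    for p in pixels:
        out.append(shade(p))
    return out


def shade_parallel(pixels: list[int], workers: int = 4) -> list[int]:
    """Data-parallel formulation: independent chunks, one per worker."""
    size = max(1, (len(pixels) + workers - 1) // workers)  # ceiling division
    chunks = [pixels[i:i + size] for i in range(0, len(pixels), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input chunks.
        results = pool.map(shade_serial, chunks)
        return [p for chunk in results for p in chunk]


if __name__ == "__main__":
    image = list(range(256))
    assert shade_parallel(image) == shade_serial(image)
    print("serial and parallel results match")
```

No pixel depends on any other pixel's result, and that independence is what lets hardware built from many small replicated cores process such workloads all at once.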
Then, there are DSPs, which are specialized pieces of hardware that are especially good at mathematical computations, and MCUs, which are very specialized CPUs but much smaller in size and only able to do a subset of the tasks that a CPU can do.
Point of contention
The choice of what to use where isn’t so simple, though, and it can create a struggle between the hardware and software teams.
Colin Walls, embedded software technologist at Mentor Graphics, asserted the decision should ultimately come down to the software design team leader. “There’s an awful lot of the selection criteria that are really driven by software. It ought to be a software decision but selection of hardware seems to be considered to be a hardware decision.”
In his experience, Walls said it’s very common for developers of all types to go with what they know instead of questioning what the optimal processor is for the particular application. “They’ll have an idea what kind of parameters are involved, and they’ll look at what they used last time. If it’s underpowered and wouldn’t do the job, they have to think about it some more. But if it’s a little bit too powerful, maybe the benefits of their previous knowledge and experience are greater than the drawbacks of slightly overpowering the design. That’s actually logical and a good engineering practice. Having the perfect device for every design is probably not a particularly useful thing to be aiming for anyway.”
And while finding the optimal processor might seem worthwhile on some levels, Walls explained that figuring out what the selection parameters are for a processor is pretty hard to do because, in principle, a modern design is going to be dominated by the software development. “To some extent, until the software development has progressed a certain distance, you’re not going to know how much processor power you’re going to need or what kind of facilities necessarily are going to be required — meaning the instruction set of one CPU may be better for one application than another, but you may not really know that until you’ve got some way down the development cycle.”
In a perfect world, the software engineers would start work long before any commitment is made, but he acknowledged that may not be practical because they need to get underway with the hardware design. That takes time, as well, so some kind of compromise has to be struck.
He stressed that software also should start before hardware because there is just that much more work to get done, and it may drive aspects of the hardware design further down the line. “Historically, it was always done the other way around. There’s a ton of work to do to shape the hardware, and the software was almost an afterthought at the end of the design process. Nowadays, that’s completely reversed. In those days, the hardware team was probably bigger than the software team, as well. That’s almost reversed everywhere, too. There are more software guys than hardware guys, there’s more work to be done in that space, but the priorities haven’t necessarily been turned around to correspond in all cases. In some companies I’m sure there are very enlightened teams that would take exactly the right attitude, but from my observation that is not common practice.”
Not everyone agrees the software team should choose the processor. Matt Gutierrez, director of marketing for Synopsys’ Solutions Group, said it really depends which team is better suited to do this. “The software guys usually have a greater say when there is a lot of legacy code that has to be migrated from one design to the next, and you can imagine that if somebody has hundreds of thousands or a million lines of code for a specific application, aside from the specific processor implementation, the software guys usually will have a big bone to pick if there is a lot of work that they’re going to have to do to port software to a different processor like from a CPU to a DSP, or even from one CPU to another CPU. That task is made easier because there are different levels of abstraction of the software stack. For example, in the old days if you programmed in assembly code the specific machine instruction for the processor, and then you switched to another one, that was really painful because you had to rewrite the programs. Nowadays, with C compilers (which aren’t all that new, but they are compared to programming in assembly code), they make that task easier. You do have to recompile code to move from one processor to another, but the task is easier so the software guys’ lives are a little bit easier. As well, there are different and higher levels of abstraction that make it even easier, but in general the software team has a bigger say when there is a lot of legacy code that is being used from one design to another.”
Another factor in whether the hardware or the software team weighs in more heavily on the processor decision is how the specific processor will be used, he said. “The central CPU, if it is running, for example, a high-level operating system, to which a whole bunch of application code has been written, is a much more painful thing than a processor that’s doing simple power management functions.”
This gets even fuzzier in complex SoCs, which may contain a CPU, various DSP cores, hardware accelerators, and perhaps even one or more programmable elements. And it likely will become fuzzier still as fan-outs and 2.5D configurations begin rolling out over the next few months, where different types of processors can be added into the same package rather than be separate on a board. At that point, there may be multiple people making choices for different reasons—some based not on hardware or software, but on cost or for very specific marketing reasons.
While multiple options are almost always available these days, there are many ways to get to the same end. Not all of them are optimal, and occasionally they don’t work as planned. In some cases, the choice may be based upon the way an engineering team is structured or a level of expertise and experience with a processor type and perhaps a particular processor of that type. But no matter how a chip is built, there will always be tradeoffs that aren’t always obvious at the outset of the engineering process.
“Adding specific accelerators, or additional different types of processors, always comes with the additional cost of silicon area,” said ARM’s Davies. “Design tradeoffs always have to take this into account. Complex domain-specific processors often come with tools that aren’t as good as for mainstream CPUs and require greater experience and expertise to get the best performance. Sometimes, for best flat-out performance, designers will have parts of the problem executing in parallel across all of the compute elements in a design.”