What Is A Custom Processor?

The definition has changed, and so has the impact on the design process.


Spurred by the latest cyclical development boom, the semiconductor industry is entering a new golden era of custom processors, but this time ‘custom processor’ means something different.

A generation ago, every major semiconductor company had in-house processors: SuperH, PowerPC, V800, Alpha, MEP, TriMedia, etc., with some specializing more than others for particular domains. But industry consolidation and the enormous expense of maintaining proprietary architectures caused many of these to fade away, and the industry entered a long period of ‘Standard Architecture,’ while custom processors filled niches for applications like audio processing.

“The last few years have seen the emergence of domain specific cores for image processing, wireless baseband, LiDAR, graphics and neural networking,” said Chris Jones, vice president of marketing at Codasip.

As such, a custom processor has grown to mean a processor optimized for a particular class of tasks. The microarchitecture and instruction set are informed by the software it ultimately will run. Today, demand is high for customization tools that can realize proprietary instruction extensions of a standard ISA.

“It is a highly efficient and low risk way for a design team to implement their ‘secret sauce,’” Jones said. Further, the design process is now more than ever about software first, which has created a demand for modeling and profiling tools to nail down custom architectures. “The RISC-V movement has contributed greatly to this momentum behind customization as its modular architecture provides space for non-standard extensions, and proprietary software IP techniques can be embodied in custom instructions without sacrificing the benefits of an industry standard ISA and the accompanying ecosystem.”
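The profiling-first flow described above can be sketched in miniature: model a candidate custom instruction in software, run the target kernel through both the base and extended versions, and compare dynamic operation counts before committing anything to hardware. Everything below is illustrative — the fused "multiply-accumulate-saturate" instruction is a hypothetical extension invented for this sketch, not a real RISC-V instruction.

```python
# Minimal sketch of profiling-driven ISA extension. A hypothetical fused
# "multiply-accumulate-saturate" custom instruction replaces three base
# operations (mul, add, compare) per element of a saturating dot product.

SAT_MAX = (1 << 31) - 1  # 32-bit signed saturation limit

def dot_sat_base(xs, ys):
    """Baseline ISA: each element costs a mul, an add, and a compare (3 ops)."""
    acc, ops = 0, 0
    for x, y in zip(xs, ys):
        acc += x * y
        acc = min(acc, SAT_MAX)
        ops += 3
    return acc, ops

def dot_sat_custom(xs, ys):
    """Extended ISA: one hypothetical fused instruction per element (1 op)."""
    acc, ops = 0, 0
    for x, y in zip(xs, ys):
        acc = min(acc + x * y, SAT_MAX)  # models the fused custom instruction
        ops += 1
    return acc, ops

xs, ys = list(range(100)), list(range(100))
(r0, base_ops) = dot_sat_base(xs, ys)
(r1, cust_ops) = dot_sat_custom(xs, ys)
assert r0 == r1  # the extension must not change functionality
print(f"dynamic ops: {base_ops} -> {cust_ops}")  # 300 -> 100
```

The final assertion mirrors the verification concern raised later in the article: a custom instruction is only acceptable if the extended core remains functionally identical to the base core on every input.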

A custom processor used to mean a CPU designed from scratch, but the definition has evolved and expanded over time.

“The availability of user-configurable IP meant that designers could choose bus and register widths, cache sizes, pipeline stages and other processor features most suitable for their targeted application,” said Nicolae Tusinschi, product specialist, design verification, at OneSpin Solutions. “While the resulting processor might not be considered custom, the large number of possible configurations meant that the specific set of features chosen might be unique among all users. If the processor IP was delivered or generated in RTL form, and if the license agreement allowed it, the user might change the design to the point that it was truly custom.”

He agreed that the RISC-V ISA has expanded the concept of a custom processor even further as its instruction set architecture defines different data widths and multiple categories of optional instructions, including privileged mode extensions and variants. Any selection of these features is considered ISA-compliant.

“Since RISC-V is an open architecture not tied to a single vendor, users can choose from many different processor variations from multiple IP vendors and even from open-source repositories. RISC-V was expressly defined for implementation on a wide range of microarchitectures, from simple controllers to parallel-processing systems with out-of-order execution, multi-level caches, and other advanced features,” Tusinschi said.

The RISC-V ISA also accommodates the addition of user-defined extensions, such as new instructions, which provide designs even more flexibility in developing a processor ideally suited for the end application, Tusinschi said. “Thus, many RISC-V processors are truly custom, with a custom selection of features, a custom microarchitecture, and custom extensions, while remaining compliant to the ISA. This more complex design process has major implications for verification. Compliance is not enough. Any RISC-V verification solution must handle optional features, verify the complete design including the microarchitecture, and be flexible enough to encompass user extensions.”

At the same time, demand for high-performance compute is constantly increasing in consumer, industrial and automotive products to deliver innovative and ‘wow’ experiences.

“Power and thermal constraints are driving the need for high power efficiency (performance per watt) in addition to high performance,” noted Lazaar Louis, senior director of product management, marketing and business development for Tensilica IP at Cadence. “Custom processors help meet these needs.”

As an example, a consumer video-calling product requires several domain-specific processors including audio, image and AI processing to deliver a compelling experience. Similarly, an autonomous vehicle requires signal processors to pre-process camera, radar, lidar and ultrasound sensor data. The next step is to perceive the surroundings of the vehicle, including the location of pedestrians and other vehicles. The following step is decision making to estimate path planning and driver assistance, Louis explained.


Fig. 1: Autonomous vehicle processors. Source: Cadence

Many custom processors in the past used proprietary instruction sets. The downside of this approach is that it limits what these processors can be used for. The upside is that the tool sets and architectural changes are maintained by vendors with a vested interest in making sure they are reliable and secure. RISC-V, which was developed by a group at UC Berkeley, provides an open instruction set that can be extended as needed.

“The major issue with custom processors was the verification of custom instructions along with the base core set to ensure that any customization did not change overall functionality,” said Dave Kelf, vice president and chief marketing officer at Breker Verification Systems. “Custom processor companies often performed this verification process for their users. RISC-V will require similar verification mechanisms, and the open-source community that has sprung up around the processor may well provide this. A verification platform that auto tests the subsystem around the processor and is modular enough to add the necessary instructions will allow for full testing of the processor together with customized instructions. Such a system will significantly add to the success of RISC-V extensions.”

The sliding scale of customization
To be sure, there are differences in what people mean by a custom processor, particularly in terms of the level of customization. With so many options and use cases, customization appears to follow a sliding scale. This is where development can get complex without the right tools.

Synopsys and others offer tools that allow engineering teams to design custom, proprietary, specialized processors. “With the tool, you specify the processor, and from there you generate the RTL, the software development kit, which is an important piece, the instruction set simulator, the compiler and the debugger, and the whole GUI infrastructure,” said Markus Willems, product marketing manager for ASIP tools at Synopsys.

These tools have wide applicability, Willems said. “We’re seeing all kinds of customers coming with all kinds of different ideas on what they want to do in terms of customization. It ranges from changing a given ISA and specialization on the microarchitecture to the next level where a given instruction set architecture is extended to add specialized instructions. That always starts from a given ISA as a starting point, all the way to building something very tailored, which is a very tuned ISA and which from the outside might be seen as a piece of RTL or dedicated functionality. That could have been done in plain, fixed RTL implementations, but where you want to replace the state machine with a more programmable entity to keep some flexibility and also to reduce the complexity of the state machine. There is a spectrum of customization happening.”

This, in turn, speaks to the number of tradeoffs that must be made today by system architects on the path to the most optimal design. Increasingly, Willems has seen those architects achieve clarity on how to make the tradeoffs. This practice has come a long way.

“In the history of processor design in the ’90s, we saw little innovation in a way, at least on the instruction set itself, because there you could just rely on the [manufacturing] process giving you the improvements,” he said. “You go to the next node and you get the 2X, 3X, and there was not much you had to do to go to higher frequencies. But that ran out of steam, and we moved into multicore designs, initially more homogeneous, so you would just do multiples of the same. But this resulted in a certain saturation. Now we’re clearly seeing an age of more specialization for tuning the multicore architecture into more dedicated cores.”

Chipmakers typically have a fairly good understanding of how to slice their algorithms in terms of what to put into which kind of specialized processor. They know which ones require higher performance, which ones have to be specialized to reach certain throughput requirements, and which are the most power-hungry components. But then comes the challenge of figuring out which parts of the algorithms and applications should map to which processing elements. That affects timing, throughput and power, and this is where EDA has a big opportunity to help engineers explore various architectures in a short period of time, particularly at a C-programmable level.

“No one dares to say, ‘I want to do assembly coding for a range of specialized processors’ and find out that I picked the wrong processor,” Willems said. “High-level programming languages of all kinds are also a key element that we’re seeing in custom processors, and you can see that on the graphic side with Cuda, and in neural network processors where the programming language is essentially a graph that you’re using as the entry point. But for the majority, it’s still C and C++.”

Working at a high level of abstraction helps to understand performance, so having an accurate model of the processor and running key kernels on that processor are essential. But there are more optimizations possible at the microarchitectural level.
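The "accurate model plus key kernels" approach can be illustrated with a toy cycle-approximate model: assign per-operation cycle costs for a candidate microarchitecture and estimate a kernel's runtime long before RTL exists. The cycle costs and the kernel's operation mix below are invented for illustration, not taken from any real core.

```python
# Sketch of a cycle-approximate processor model. Per-operation cycle costs
# stand in for a candidate microarchitecture; the op mix comes from profiling
# the kernel. All numbers are illustrative assumptions.

CYCLE_COST = {"load": 2, "mul": 1, "add": 1, "store": 2, "branch": 1}

def estimate_cycles(op_mix):
    """op_mix maps operation name -> dynamic count observed when profiling."""
    return sum(CYCLE_COST[op] * count for op, count in op_mix.items())

# Hypothetical dynamic op counts for a 1,024-tap filter kernel.
fir_kernel = {"load": 2048, "mul": 1024, "add": 1024, "store": 1, "branch": 1024}

print("estimated cycles:", estimate_cycles(fir_kernel))
```

Swapping in a different `CYCLE_COST` table (say, a core with a single-cycle load or a fused multiply-add) immediately re-scores every kernel, which is exactly the kind of fast architectural exploration the paragraph above describes.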

Memory design adds a whole other level of optimization. “Very often, it is not the processor itself,” said Willems. “It’s getting the data in and out and making sure that the data is available to the different processors in time. The tradeoffs in different memory architectures and I/O interfaces to the processor are key elements of custom processor design.”
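That data-movement tradeoff can be made concrete with a simple timing model: if each data tile must be fetched before compute starts, transfer time adds to compute time, but with double buffering the next tile's transfer overlaps compute on the current one, so total time approaches the larger of the two. The tile counts and timings below are illustrative assumptions, not measurements.

```python
# Toy model of the compute-vs-data-movement tradeoff Willems describes.
# With a single buffer the processor stalls on every transfer; with double
# buffering, the next tile's transfer hides under the current tile's compute.

def total_time(n_tiles, compute_per_tile, transfer_per_tile, double_buffered):
    if not double_buffered:
        # Each tile: wait for its transfer, then compute.
        return n_tiles * (compute_per_tile + transfer_per_tile)
    # First tile must still be fetched; after that, each step takes the
    # longer of compute and transfer, since the two overlap.
    return transfer_per_tile + n_tiles * max(compute_per_tile, transfer_per_tile)

single = total_time(64, compute_per_tile=100, transfer_per_tile=80, double_buffered=False)
double = total_time(64, compute_per_tile=100, transfer_per_tile=80, double_buffered=True)
print(single, double)  # 11520 vs 6480
```

The model also shows when extra buffering stops helping: once `transfer_per_tile` exceeds `compute_per_tile`, the design is memory-bound and the fix lies in the memory architecture or I/O interfaces, not the processor pipeline.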

Design challenges of custom processors
Key metrics in any processor design involve power—performance per watt or milliwatts per operation.
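The two framings of the metric are reciprocal views of the same quantity: performance per watt (e.g., GOPS/W) and energy per operation are related by a fixed conversion, so a design that doubles one halves the other. The design points below are invented for illustration.

```python
# Two hypothetical design points. GOPS/W (throughput per watt) and pJ/op
# (energy per operation) are reciprocal framings of the same efficiency.
designs = {
    "general_purpose_cpu": {"gops": 50.0,  "watts": 5.0},
    "custom_dsp":          {"gops": 200.0, "watts": 2.0},
}

for name, d in designs.items():
    gops_per_watt = d["gops"] / d["watts"]
    # Energy per operation: watts / (ops per second), expressed in picojoules.
    pj_per_op = d["watts"] / (d["gops"] * 1e9) * 1e12
    print(f"{name}: {gops_per_watt:.1f} GOPS/W, {pj_per_op:.1f} pJ/op")
```

Note that pJ/op is simply 1000 divided by GOPS/W, which is why the two metrics are used interchangeably in data sheets.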

“Very often it’s number crunching, so you have to analyze all the sensor data, and it’s a lot of combined signal processing with some decision making that you have to do at these edge devices,” Willems said. “Designing a compiler for such a processor in conjunction with the hardware, and designing the simulator, is a task that requires completely different skills. You need to bring those skills together in a team. The teams must be organized accordingly so that the right expertise is there. If you’re coming from a hardware background, and now you decide to become more specialized in custom processors, it means you cannot buy it from an IP provider. You start to design it in-house, and the subject of a software development kit is the one thing that is a hurdle for the adoption.”

As custom processors evolve, success for engineering teams across the breadth of the semiconductor ecosystem comes down to advances in ISAs, tools, teams and tradeoffs. The momentum has swung toward capturing ideas in an executable format, creating an early simulation model, and performing profiling-based analysis rather than the antiquated spreadsheet approach.

But as AI and machine learning proliferate, the opportunity for specialized processing elements continues to grow.

“The more architectures you see, the more people get inspired,” Willems said. “The whole market triggers itself just by the fact that people are successful in the market with more specialized processors.”

Related Stories
Open Source Processors: Fact Or Fiction?
Calling an open-source processor free isn’t quite accurate.
Looking Beyond The CPU
While CPUs continue to evolve, performance is no longer limited to a single processor type or process geometry.
Chiplets, Faster Interconnects, More Efficiency
Why Intel, AMD, Arm, and IBM are focusing on architectures, microarchitectures, and functional changes.


