It’s not just about productivity these days.
Once upon a time, “behavioral synthesis,” the precursor to high-level synthesis, hung its hat on design productivity as its sole value. By that, I mean, if a behavioral synthesis tool provides a high enough productivity benefit, designers or design managers will boil the ocean to move to it. There was little methodology around it. In fact, even the design entry language was unfamiliar. Yes, it was Verilog or VHDL, but it was “behavioral” Verilog or VHDL—something unfamiliar to most engineers at the time.
As an applications engineer in the late ’90s, I lost count of the number of times I explained that “@(posedge clk)” is a sequential construct, not a parallel construct. The parallel constructs are “always” and “initial”!
Today, the “unfamiliar language” issue has been solved by using IEEE 1666 SystemC and C++ as the design language of choice when using high-level design and verification. Most hardware engineers know that SystemC allows them to express hardware structure (modules, hierarchy, I/O timing, bit-accuracy) as well as the algorithmic functionality of each block.
That removes a giant barrier to realizing the productivity promise of HLS, but productivity alone does not explain why most of the top semiconductor companies in the world are using HLS. As Brian Bailey aptly noted last year, “At the end of the day, the users have spoken and most of the top semiconductor companies are using HLS. They pushed the EDA vendors into the C, C++, SystemC camp, and they are using HLS to create chips that are making them money.” See Brian’s original post at http://semiengineering.com/tale-of-two-hls-viewpoints/.
So, if it’s not all about productivity, why are these companies using HLS?
OK, I admit it. Raw productivity is still a major driver for HLS. I’d be lying if I said otherwise. Actually, productivity is the most common reason why companies come in the door asking about HLS. HLS does indeed provide a fast, high-quality path to RTL. It does this by allowing designers to focus on their design work rather than on detailed and mechanical RTL implementation tasks.
Engineers focus on design work; HLS handles the implementation details.
But HLS is not just about design productivity. It’s about a more productive design and verification flow. You are not only designing hardware earlier, but also verifying it earlier. That means you tend to catch problems and fix them earlier. Better yet, you can reuse many of your verification assets between the behavioral SystemC and the RTL. In short, HLS provides a highly productive path not just to an RTL implementation, but to a high-quality, well-verified RTL implementation.
If productivity is why new HLS users come in the door, IP reuse is why they stay.
When designing with HLS, the IP you are designing is written in high-level SystemC/C++. “High-level” means that the functionality and macro-architecture are coded in your IP, but the implementation decisions are not. Specifically, the finite state machine, datapath components, multiplexors, pipeline registers, and other such details are not written in your SystemC IP model. Instead, the HLS tool makes those implementation decisions automatically, based on the technology library and the synthesis constraints you give it, such as clock period(s) and maximum latency in clock cycles.
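To make “high-level” a little more concrete, here is a minimal sketch of what a behavioral description might look like. It is plain C++ rather than real SystemC so that it compiles anywhere, and the names (Fir4, kTaps, step) are hypothetical ones I made up for illustration. Notice what is missing: there is no state machine, no multiplexor, and no pipeline register anywhere in the code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical behavioral model of a 4-tap FIR filter (illustration only).
// It states WHAT is computed; it says nothing about how many multipliers
// exist, how they are shared, or how many clock cycles one sample takes.
constexpr std::size_t kTaps = 4;

struct Fir4 {
    // Sample history, zero-initialized.
    std::array<std::int16_t, kTaps> delay_line{};

    // Consume one input sample and return one output sample.
    std::int32_t step(std::int16_t sample,
                      const std::array<std::int16_t, kTaps>& coeffs) {
        // Shift the history by one position and insert the new sample.
        for (std::size_t i = kTaps - 1; i > 0; --i) {
            delay_line[i] = delay_line[i - 1];
        }
        delay_line[0] = sample;

        // Multiply-accumulate across all taps.
        std::int32_t acc = 0;
        for (std::size_t i = 0; i < kTaps; ++i) {
            acc += static_cast<std::int32_t>(delay_line[i]) * coeffs[i];
        }
        return acc;
    }
};
```

Whether those four multiplications end up on four parallel multipliers or on one shared multiplier over several cycles is exactly the kind of decision that is left to the tool and the constraints, not written into the source.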
This means, for example, a high-performance implementation and a low-power implementation can use the exact same behavioral IP. Only the synthesis constraints change for each implementation.
One behavioral IP description supports multiple implementations.
Additionally, by having the IP source code at a higher level of abstraction, you can make more substantial functional changes to the design than you feasibly could when writing RTL by hand. For example, you can change the SystemC or C++ behavior to implement a spec change or even to use a different algorithm. In either case, HLS will then generate modified RTL automatically. Typically, this new (modified) piece of IP will then be added to the behavioral IP library for reuse on a future project.
By the way, if you were at DAC 2015, you may remember Elad Litman from Intel talking about the benefit of behavioral IP reuse in poster session 31.33, “HLS Soft-IP: The New Standard in Soft-IP Creation.”
HLS would not be in the silicon in your pocket, car, and home if it didn’t produce high quality of results (QoR).
Some of the QoR benefit is due to the advanced optimizations automatically performed by HLS. For example, HLS can create many customized datapath components during its automated tradeoff analysis, picking the best one(s) to meet the given synthesis constraints in the given technology. Similarly, HLS can determine how to share datapath logic and registers between pipelines of varying depths, which is normally quite difficult (but not impossible) to do by hand.
But in general, the most common way HLS produces better QoR than hand-written RTL (yes, I said “better”) is that it allows more design space exploration. We all know that architectural decisions early in the design process are the ones that have the most impact in terms of QoR. When designing via hand-written RTL, you typically can only create one RTL implementation of one architecture. However, with HLS, many different implementations can be created from one golden behavioral IP, allowing you to choose the implementation that best meets the needs of this specific project.
As shown below, even a simple inverse discrete cosine transform can be implemented with dozens of different micro-architectures, each providing different PPA tradeoffs. With HLS, you can carefully evaluate each and choose the best implementation for each specific application of the IP.
Design space exploration finds the best PPA trade-off for your specific application.
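One simple way to think about choosing among many candidate implementations is Pareto filtering: discard any candidate that another candidate beats or ties on every metric, and pick from what remains. The sketch below assumes you have already collected area, latency, and power figures for each candidate from your own synthesis runs. The Candidate struct and the function names are hypothetical, and this is not how any particular HLS tool performs its exploration.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one candidate micro-architecture.
// For all three metrics, smaller is better.
struct Candidate {
    std::string name;
    double area;     // e.g. gate count or square microns
    double latency;  // e.g. clock cycles per result
    double power;    // e.g. milliwatts
};

// True if a is no worse than b on every metric and strictly better on one.
bool dominates(const Candidate& a, const Candidate& b) {
    const bool no_worse = a.area <= b.area &&
                          a.latency <= b.latency &&
                          a.power <= b.power;
    const bool strictly_better = a.area < b.area ||
                                 a.latency < b.latency ||
                                 a.power < b.power;
    return no_worse && strictly_better;
}

// Keep only the candidates that no other candidate dominates,
// preserving their original order.
std::vector<Candidate> pareto_front(const std::vector<Candidate>& all) {
    std::vector<Candidate> front;
    for (const Candidate& c : all) {
        bool dominated = false;
        for (const Candidate& other : all) {
            if (dominates(other, c)) {
                dominated = true;
                break;
            }
        }
        if (!dominated) {
            front.push_back(c);
        }
    }
    return front;
}
```

The candidates that survive are the real tradeoffs. Which one is “best” still depends on the needs of the specific project, which is a design decision and not something the filter can make for you.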
The common thread, and root cause of all of these benefits, is working at a higher level of abstraction. It’s the higher level of abstraction—working with design decisions vs. implementation details—that makes HLS a more effective way to create hardware.
One piece of advice to any new HLS users: Don’t take that last statement lightly. Although you are working with SystemC and C++, you are still creating hardware with HLS. It is a more effective and arguably more fun way to get to silicon, but it isn’t magic. As with any tool, you still need to invest the time to learn how to use it proficiently for the task at hand.