Why Reconfigurability Is Essential For AI Edge Inference Throughput

Getting the best performance from a convolutional neural network requires a flexible architecture.

For a neural network to run at its fastest, the underlying hardware must execute every layer efficiently. Over the course of inference for any CNN, whether it is based on an architecture such as YOLO, ResNet, or Inception, the workload regularly shifts from being bottlenecked by memory to being bottlenecked by compute. You can think of each convolutional layer as its own mini-workload, so the challenge for any inference solution is to ensure that all of these different mini-workloads run efficiently. An architecture that can adapt to the changing requirements of inference by handling both extremes of computational intensity (layers with few filters as well as layers with many filters) is crucial for getting the best performance.
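
One common way to make the memory-bound versus compute-bound distinction concrete is the roofline model, which compares a layer's arithmetic intensity (operations per byte moved) against the hardware's own compute-to-bandwidth ratio. The sketch below illustrates the idea; the throughput, bandwidth, and layer figures are hypothetical placeholders, not the specs of any real device.

```python
# Minimal sketch of roofline-style classification of a layer.
# The hardware numbers below are illustrative assumptions, not real device specs.

PEAK_OPS_PER_S = 4e12      # hypothetical peak compute throughput (ops/s)
PEAK_BYTES_PER_S = 25e9    # hypothetical memory bandwidth (bytes/s)
MACHINE_BALANCE = PEAK_OPS_PER_S / PEAK_BYTES_PER_S  # ops per byte at the roofline "knee"

def bottleneck(layer_ops, layer_bytes):
    """Return which resource limits this layer under the roofline model."""
    intensity = layer_ops / layer_bytes  # ops performed per byte moved
    return "compute-bound" if intensity >= MACHINE_BALANCE else "memory-bound"

# A layer with little data reuse is limited by memory traffic;
# a layer with heavy reuse is limited by compute resources.
print(bottleneck(layer_ops=1e8, layer_bytes=4e6))   # low reuse  -> memory-bound
print(bottleneck(layer_ops=1e10, layer_bytes=4e6))  # high reuse -> compute-bound
```

A fixed-function pipeline is sized for one point on this spectrum; a reconfigurable one can rebalance its resources as each layer crosses from one regime to the other.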

As we discussed in past blog posts, not all convolutions are the same, because the compute-to-memory-access ratio can change dramatically from one convolution to another. As a rule of thumb, the ratio of compute operations to bytes required is set by the number of filters in a convolutional layer. Put another way, the more features you are looking for in a single layer, the more you reuse the data you have already fetched to perform computations.
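
As an illustration of that rule of thumb, here is a back-of-the-envelope estimate of operations per byte for a single convolution. It assumes int8 tensors and that the input, weights, and output each cross memory exactly once; the layer shapes are made up for demonstration only.

```python
# Hedged sketch: arithmetic intensity (ops per byte) of one conv layer,
# assuming int8 tensors and each tensor moving across memory exactly once.

def conv_intensity(h, w, c, k, r=3, s=3, bytes_per_elem=1, stride=1):
    h_out, w_out = h // stride, w // stride          # "same" padding for simplicity
    macs = h_out * w_out * k * r * s * c             # multiply-accumulates
    ops = 2 * macs                                   # count multiply and add separately
    traffic = bytes_per_elem * (h * w * c            # input activations
                                + k * r * s * c      # weights
                                + h_out * w_out * k) # output activations
    return ops / traffic

# Same input tensor, different filter counts: more filters -> more reuse per byte.
for k in (16, 64, 256):
    print(k, round(conv_intensity(h=56, w=56, c=64, k=k), 1))
```

With few filters, the bytes fetched for the input activations are amortized over only a handful of output channels, so intensity stays low; as the filter count grows, each fetched byte feeds many more computations.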

The chart below shows that each of several models contains many different convolutions, and that these convolutions lie at different points along the compute-to-memory spectrum. It also turns out that most convolutions are unique to each model, so the ability to run one model efficiently doesn't necessarily mean you can run another. Instead, it's better to look for an architecture that is flexible enough to handle all the different kinds of models.

Future-proofing and the role of the compiler

Beyond offering a better-matched datapath, a flexible architecture also provides the advantage of accommodating future versions of your model. As your model evolves, so must the architecture beneath it. Reconfigurable computing is a must-have for any application with a long life and iterative updates, which is exactly what you find in an edge or embedded device. This flexibility ensures that you can update your model in the field without worrying about whether the underlying hardware can support it.

Ultimately, the underlying hardware's ability to reconfigure does not need to be exposed to developers, who are already focused on achieving maximum accuracy for their models. Instead, a good edge inference solution should include a robust compiler that can leverage the reconfigurability of the underlying hardware automatically, reducing the cognitive load on deep learning engineers to achieve maximum performance. The benefits of reconfigurable computing, such as maximizing performance and ensuring future support, can only be unlocked by a compiler stack developed in sync with the hardware.

In summary, if you want the best performance for your edge inference workload, look for a solution built on flexible hardware with a mature compiler that can unlock the benefits of the reconfigurable computing platform.


