SPONSOR BLOG

Challenges Of Edge AI Inference

Common pitfalls in training and deploying CNN solutions, and how to avoid them.

July 1st, 2021 - By: Vinay Mehta

Bringing convolutional neural networks (CNNs) to your industry, whether medical imaging, robotics, or some other vision application entirely, has the potential to enable new functionality and reduce the compute requirements of existing workloads. This is because a single CNN can replace more computationally expensive image processing, denoising, and object detection algorithms. However, in our experience with customers, the same challenges arise again and again as they move an idea from conception to production. In this article, we’ll review those common challenges and describe some solutions that can smooth the development and deployment of CNN models in your edge AI application.

Leverage existing models

We see a lot of companies attempting to create models from the ground up. However, proven models already exist for almost every application, so rather than reinventing the wheel, it’s often much easier to start with a network based on one of these architectures. Moreover, starting with a known model reduces the time, data, and effort needed to train a model for your application, since an existing model can be retrained on your own data in a process called ‘transfer learning.’ For example, rather than trying to define your own pose estimation network, start with the OpenPose model and retrain it on the pose data you need to recognize. Likewise, if you are attempting to perform object detection, solutions such as YOLOv3 offer a computationally simple and straightforward way to get the job done after retraining on your dataset.
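
As a toy illustration of the idea behind transfer learning, the sketch below keeps a stand-in "pretrained" feature extractor frozen and trains only a small new classification head on application-specific data. Everything in it is hypothetical: the `BACKBONE_W` weights, the `backbone`, `train_head`, and `predict` helpers, and the four-sample dataset are assumptions made for illustration. It is plain Python, not an actual OpenPose or YOLOv3 retraining flow, which would normally be done in a deep learning framework.

```python
import math

# Hypothetical "pretrained" backbone weights, standing in for a network that
# was already trained on a large generic dataset. They are never updated.
BACKBONE_W = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]

def backbone(x):
    """Frozen feature extractor: one linear layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in BACKBONE_W]

def train_head(samples, labels, epochs=200, lr=0.5):
    """Fit a new logistic-regression head on top of the frozen features."""
    feats = [backbone(x) for x in samples]  # computed once: the backbone is frozen
    w = [0.0] * len(feats[0])
    b = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log loss with respect to z
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b

def predict(x, w, b):
    """Classify x using the frozen backbone plus the retrained head."""
    z = sum(wi * fi for wi, fi in zip(w, backbone(x))) + b
    return 1 if z > 0 else 0

# Hypothetical application data: label 1 when the first input exceeds the second.
samples = [(2.0, 0.0), (0.0, 2.0), (1.5, 0.5), (0.5, 1.5)]
labels = [1, 0, 1, 0]
w, b = train_head(samples, labels)
```

Only the head's few weights are learned here, which is why a handful of samples is enough. The same division of labor (reuse the feature extractor, retrain the task-specific layers) is what lets a retrained network get by with less data and training time than one built from scratch.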

Simple models are effective

For most applications, the truth is that you don’t need the latest and greatest in CNN architectures. For example, if your application only requires distinguishing between a few different objects with high certainty, even a simple, unaugmented detector such as YOLOv3 can do the job. Likewise, if super resolution or image denoising is your ultimate goal, you may only need a 10-layer network with fewer than 64 filters per layer. Customers can benefit greatly once they realize that their applications can be solved at a fraction of the computational complexity, using much simpler models than those at the forefront of research. The goal is not to make the migration to CNNs any harder than it has to be.
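
To put a rough number on how small such a network is, the following sketch counts the parameters and multiply-accumulate operations (MACs) of a hypothetical 10-layer denoiser. The 3x3 kernels, the 48 filters per hidden layer, the RGB input and output, and the 1920x1080 frame size are all assumptions chosen for illustration, as are the helper names `conv_cost` and `network_cost`.

```python
def conv_cost(c_in, c_out, kernel, height, width):
    """Return (parameters, MACs) for one stride-1, same-padded conv layer."""
    weights = kernel * kernel * c_in * c_out
    params = weights + c_out          # weights plus one bias per output channel
    macs = weights * height * width   # one MAC per weight per output pixel
    return params, macs

def network_cost(channels, kernel, height, width):
    """Sum the cost over consecutive conv layers.

    `channels` lists the input channel count followed by each layer's
    output channel count, so it has one more entry than there are layers.
    """
    total_params = total_macs = 0
    for c_in, c_out in zip(channels, channels[1:]):
        p, m = conv_cost(c_in, c_out, kernel, height, width)
        total_params += p
        total_macs += m
    return total_params, total_macs

# Hypothetical denoiser: RGB in, nine layers of 48 filters, RGB out (10 layers).
denoiser_channels = [3] + [48] * 9 + [3]
params, macs = network_cost(denoiser_channels, kernel=3, height=1080, width=1920)
print(f"{params:,} parameters, {macs / 1e9:.1f} GMACs per frame")
```

Under these assumptions the whole model has fewer than 170,000 parameters, so at one byte per weight it would occupy well under a megabyte. Swapping in your own layer widths and frame size gives a quick first estimate before committing to an architecture.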

Integrate quantization early

Quantizing a model down from multi-byte precisions such as FP32 or BF16 to a single-byte format such as INT8 can multiply inference speed with little to no degradation in accuracy. However, many customers get tripped up on quantizing their model because it adds steps to the training and model creation process that can be tricky to implement. For example, frameworks such as PyTorch and ONNX expose their own methods for quantizing models, but they’re not always compatible with each other. Flex Logix has helped customers navigate the different options for quantization (e.g., static, dynamic, and quantization-aware training). The main takeaways are that you should be consistent in your approach and plan for quantization from the outset of developing your model.
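
For readers new to the arithmetic, here is a minimal sketch of per-tensor affine quantization to signed 8-bit integers, the basic mapping that static quantization schemes build on. It is plain Python and is not the PyTorch or ONNX implementation; the function names `quant_params`, `quantize`, and `dequantize` and the sample weights are hypothetical.

```python
def quant_params(values, qmin=-128, qmax=127):
    """Derive a scale and zero point mapping the data range onto [qmin, qmax]."""
    lo = min(min(values), 0.0)  # widen the range to include 0.0 so that
    hi = max(max(values), 0.0)  # zero is represented exactly
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for all-zero data
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point

def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to integers, clamping to the representable range."""
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]

def dequantize(q, scale, zero_point):
    """Map integers back to approximate float values."""
    return [(qi - zero_point) * scale for qi in q]

# Hypothetical calibration data standing in for one layer's weights.
weights = [-1.0, -0.25, 0.0, 0.4, 1.55]
scale, zero_point = quant_params(weights)
q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

For values inside the calibrated range, the round-trip error is at most half of `scale`, which is why a well-chosen range matters so much. Roughly speaking, static quantization fixes the activation range ahead of time from calibration data, dynamic quantization computes it at inference time, and quantization-aware training simulates the rounding during training so the weights can adapt to it.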

Conclusion

Most customers run into the same set of issues when attempting to bring CNNs to their industry, but Flex Logix can help ease the transition and unlock the potential of AI in your application. Using pre-existing models, optimizing models for your application, and quantizing your workload early will help you get your application running and deliver additional value to your customers.

Vinay Mehta

Vinay Mehta is the inference technical marketing manager at Flex Logix.
