SPONSOR BLOG

Apples, Oranges & The Optimal AI Inference Accelerator

Pay attention to these key areas to determine the right accelerator for your needs.

September 3rd, 2020 - By: Geoff Tate

There is a wide range of AI inference accelerators available and a wide range of applications for them.

No AI inference accelerator will be optimal for every application. For example, a data center class accelerator almost certainly will be too big, burn too much power, and cost too much for most edge applications. And an accelerator optimal for keyword recognition won’t have the capability to handle more computationally intensive image CNNs.

Like Goldilocks, the optimal AI inference accelerator for your application needs to be “just right”… for you.

In our experience, many customers already have neural network models running and are looking for more throughput on those models within their cost, power, and size constraints.

Apples and oranges: Get the full picture

A recent product announcement touted a frame rate for a popular neural network model ten times faster than Nvidia Xavier at a tiny fraction of the power.

But you need to get the full picture.

If a frame rate is given, you can’t judge how impressive it is without knowing:

The image size.
Is it batch=1 for minimum latency or batch>1 for maximum throughput?
What numerics are being used (INT4, INT8, BF/FP16, …)?
Has the neural network model or the computation algorithm been altered, and if so, how and with what effect on accuracy? What post-training optimizations, such as pruning, have been applied, and how does the accelerator deal with sparsity, both at compile time and at runtime?
What prediction accuracy is achieved, especially if the model or weights have been altered?

If power is given, you need to know:

What are the measurement conditions (temperature, voltage, process)?
What model is running?
Is it the power for the inference accelerator core? For the whole chip? Or the whole chip plus DRAM?
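
One way to keep track of these questions is to record each vendor claim and list what is still unanswered. The following is a minimal sketch in Python, not a standard tool: the `BenchmarkClaim` record, its field names, and the `missing_context` helper are all hypothetical names chosen for illustration, and the example values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkClaim:
    """One vendor performance claim. None means the vendor did not say.

    All field names are hypothetical, chosen to mirror the questions above.
    """
    frames_per_second: Optional[float] = None
    power_watts: Optional[float] = None
    model_name: Optional[str] = None
    image_size: Optional[str] = None           # e.g. "1920x1080"
    batch_size: Optional[int] = None           # 1 for minimum latency
    numerics: Optional[str] = None             # e.g. "INT8"
    model_alterations: Optional[str] = None    # e.g. "none", "pruned"
    prediction_accuracy: Optional[float] = None
    power_conditions: Optional[str] = None     # temperature, voltage, process
    power_scope: Optional[str] = None          # "core", "chip", "chip+DRAM"


# Context needed before a frame rate or a power number can be judged.
FRAME_RATE_CONTEXT = ("image_size", "batch_size", "numerics",
                      "model_alterations", "prediction_accuracy")
POWER_CONTEXT = ("power_conditions", "model_name", "power_scope")


def missing_context(claim):
    """Return the names of the context fields the claim needs but omits."""
    required = []
    if claim.frames_per_second is not None:
        required += FRAME_RATE_CONTEXT
    if claim.power_watts is not None:
        required += POWER_CONTEXT
    return [name for name in required if getattr(claim, name) is None]


# Made-up claim: a frame rate and a power figure with little else stated.
claim = BenchmarkClaim(frames_per_second=300.0, power_watts=2.0,
                       model_name="ExampleNet", numerics="INT8")
print(missing_context(claim))
```

Every name the helper returns is a question to send back to the vendor before comparing the number with anything else.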

The right way to compare two accelerators

Get both vendors to benchmark your neural network model at your image size with your preferred numerics; if they in any way alter the model or the weights, get them to give you the impact on prediction accuracy.

Ask all of the questions above to determine the throughput and power for the operating conditions that matter for your application.

Also, get a demo: if the vendor can’t demo what they claim, something isn’t right. When you get a demo, ask to see live streams, not videos. If you are there in person, put your hand in front of the camera to verify it’s real time; if you are on Zoom, ask them to put their hand in front of the camera. The point is to verify that inference is actually happening in real time and that you’re not watching something pre-canned and in some way modified to look better. Fresh fruit is better than canned fruit.

And get price information for both accelerators at the same volumes. If you can’t get that, ask what the die size is and in what process, to get some sense of relative cost.

The optimum accelerator that is “just right” for you will have the best throughput/$ and the best throughput/watt for your model at your image size at your target prediction accuracy.
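
As a rough sketch of that comparison, the two ratios can be computed from each vendor's benchmark of your model once results that miss your accuracy target are dropped. The `BenchmarkResult` record, the `rank_accelerators` function, and every number below are hypothetical and purely illustrative. Sorting by throughput/$ is an arbitrary choice made here; the two ratios can disagree, in which case your own cost and power constraints decide.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """Result of running your model, at your image size, on one accelerator."""
    name: str
    frames_per_second: float
    power_watts: float          # measured at the same scope for every vendor
    unit_price_usd: float       # quoted at the same volume for every vendor
    prediction_accuracy: float


def rank_accelerators(results, min_accuracy):
    """Drop results below the accuracy target, then compute both ratios."""
    rows = []
    for r in results:
        if r.prediction_accuracy < min_accuracy:
            continue  # does not meet the target accuracy, so not comparable
        rows.append({
            "name": r.name,
            "fps_per_dollar": r.frames_per_second / r.unit_price_usd,
            "fps_per_watt": r.frames_per_second / r.power_watts,
        })
    # Sorted by throughput/$ here; sort by fps_per_watt if power dominates.
    return sorted(rows, key=lambda row: row["fps_per_dollar"], reverse=True)


# Made-up numbers for three hypothetical accelerators.
results = [
    BenchmarkResult("A", 100.0, 10.0, 50.0, 0.76),
    BenchmarkResult("B", 300.0, 60.0, 300.0, 0.76),
    BenchmarkResult("C", 400.0, 5.0, 40.0, 0.60),  # fast, but misses accuracy
]
for row in rank_accelerators(results, min_accuracy=0.75):
    print(row)
```

With these made-up numbers, accelerator C is excluded for missing the accuracy target, and A comes out ahead of B on both ratios even though B has the higher raw frame rate.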

Conclusion

Be careful not to jump to conclusions when you hear an impressive performance number without all of the data needed to judge it. The less information a vendor gives, the more they are probably hiding.

Get all the data so you can be sure to pick the right fruit for you.

Geoff Tate

Geoff Tate is a technology strategy advisor. He was the founding CEO of Flex Logix (now part of Analog Devices). Before that, he was the founding CEO of Rambus, and prior to that he was senior vice president of AMD's processor group. He received his BSc in computer science from the University of Alberta, and an MBA from Harvard Business School.
