Software Is At Least As Important As Hardware For Inference Accelerators


In articles and conference presentations on inference accelerators, the focus is primarily on TOPS (frequency times number of MACs), a little on memory (DRAM interfaces and on-chip SRAM), very little on interconnect (also very important, but that’s another story), and almost nothing on the software! Without software, the inference accelerator is a rock that does nothing. Software is wha... » read more

Where Is The eFPGA Market And Ecosystem Headed?


In this article we’ll discuss the availability of eFPGA, the applications for eFPGA, and the current and future market size for eFPGA. eFPGA vendors & offerings: Embedded FPGA is a new development this decade. Multiple vendors now offer eFPGA on a wide range of process nodes, with multiple customers. eFPGA vendors: Menta has had eFPGA available the longest; their offe... » read more

Where Is The Edge AI Market And Ecosystem Headed?


Until recently, most AI was in datacenters and most of it was training. Things are changing quickly. Projections are that AI sales will grow rapidly to tens of billions of dollars by the mid-2020s, with most of the growth in Edge AI inference. Edge inference applications: Where is the edge inference market today? Let’s look at the markets from highest throughput to lowest. Edge servers: Recently Nvidia annou... » read more

Modeling AI Inference Performance


The metric in AI inference that matters to customers is throughput/$ and/or throughput/watt for their model. You might assume throughput correlates with TOPS, but you’d be wrong. Examine the table below: the Nvidia Tesla T4 gets 7.4 inferences/TOP, the Xavier AGX 15, and the InferX X1 34.5. And the InferX X1 does it with 1/10th to 1/20th of the DRAM bandwidth of the ... » read more
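
To make that metric concrete, here is a minimal Python sketch of the throughput-per-TOPS calculation; the numbers in the usage example are illustrative placeholders, not published benchmark results.

```python
# Minimal sketch: compare inference efficiency (throughput per peak TOPS)
# across accelerators. The throughput and TOPS numbers below are purely
# illustrative placeholders, not measured results for any real chip.

def inferences_per_tops(throughput_ips: float, peak_tops: float) -> float:
    """Throughput (inferences/sec) normalized by peak TOPS."""
    return throughput_ips / peak_tops

# Hypothetical example: an accelerator with 100 peak TOPS delivering
# 1,000 inferences/sec on some model scores 10 inferences/sec per TOPS.
print(inferences_per_tops(1_000, 100))   # -> 10.0
```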

Advantages Of BFloat16 For AI Inference


Essentially all AI training is done with 32-bit floating point. But doing AI inference with 32-bit floating point is expensive, power-hungry, and slow. And quantizing models to 8-bit integer, which is very fast and lowest-power, is a major investment of money, scarce resources, and time. Now BFloat16 (BF16) offers an attractive balance for many users. BFloat16 offers essentially t... » read more
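
As a rough illustration of what BF16 keeps and what it gives up, here is a hedged Python sketch that truncates an FP32 value to a BF16 bit pattern (it keeps FP32’s 8-bit exponent and drops mantissa bits; real hardware typically rounds rather than truncates).

```python
# Minimal sketch of the BFloat16 idea: keep FP32's 8-bit exponent (and thus
# its dynamic range) but shorten the mantissa to 7 bits. A simple truncating
# FP32 -> BF16 conversion just drops the low 16 bits of the IEEE-754 word.
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """16-bit BF16 pattern for an FP32 value (simple truncation, no rounding)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits >> 16

def bf16_bits_to_fp32(b: int) -> float:
    """Expand a BF16 bit pattern back to FP32 by zero-filling the dropped bits."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

# ~3.140625: the value's range is preserved, only precision is reduced.
print(bf16_bits_to_fp32(fp32_to_bf16_bits(3.14159)))
```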

eFPGA Macros Deliver Higher Speeds from Less Area/Resources


We work with a lot of customers designing eFPGA into their SoCs. Most of them have “random logic” RTL, but some customers have large numbers of complex, frequently used blocks. We have found that in many cases we can help the customer achieve higher throughput AND use less silicon area with Soft Macros. Let’s look at an example, a 64x64 Multiply-Accumulate (MAC), below: If yo... » read more

AI Inference Memory System Tradeoffs


When companies describe their AI inference chip they typically give TOPS but don’t talk about their memory system, which is equally important. What is TOPS? It stands for Trillions (Tera) of Operations Per Second. It is primarily a measure of maximum achievable throughput, not a measure of actual throughput. Most operations are MACs (multiply/accumulates), so TOPS = (number of MAC units) x... » read more
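
Assuming the standard convention that one MAC counts as two operations (a multiply plus an add), here is a minimal worked example of the peak-TOPS arithmetic; the MAC count and clock frequency are illustrative, not any particular chip’s numbers.

```python
# Worked example of the peak-TOPS arithmetic, assuming the common convention
# that one MAC counts as 2 operations (a multiply plus an add). The MAC count
# and clock frequency below are illustrative placeholders.

def peak_tops(num_macs: int, freq_ghz: float, ops_per_mac: int = 2) -> float:
    """Peak TOPS = (number of MAC units) x (frequency) x (ops per MAC)."""
    return num_macs * freq_ghz * ops_per_mac / 1_000  # MACs * GHz -> tera-ops/s

# e.g. 4,096 MACs at 1 GHz -> 8.2 peak TOPS (a ceiling, not delivered throughput)
print(round(peak_tops(4_096, 1.0), 1))
```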

TOPS, Memory, Throughput And Inference Efficiency


Dozens of companies have developed or are developing IP and chips for Neural Network Inference. Almost every AI company gives TOPS but little other information. What is TOPS? It stands for Trillions (Tera) of Operations Per Second. It is primarily a measure of maximum achievable throughput, not a measure of actual throughput. Most operations are MACs (multiply/accumulates), so TOPS = (number of MAC... » read more
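
To show the gap between peak TOPS and delivered throughput, here is a small sketch that folds in a MAC-utilization factor; all numbers in the usage example are hypothetical placeholders.

```python
# Minimal sketch distinguishing peak TOPS from delivered throughput: real
# models only keep the MACs busy a fraction of the time, so what matters is
# (peak TOPS) x (utilization). All numbers below are hypothetical.

def delivered_inferences_per_sec(peak_tops: float, mac_utilization: float,
                                 gops_per_inference: float) -> float:
    """Inferences/sec = delivered tera-ops/s divided by giga-ops per inference."""
    delivered_tops = peak_tops * mac_utilization
    return delivered_tops * 1_000 / gops_per_inference

# e.g. a 100-TOPS chip running at 25% MAC utilization on a model that needs
# about 7 GOPs per image delivers roughly 3,600 inferences/sec, not 14,000+.
print(round(delivered_inferences_per_sec(100, 0.25, 7.0)))
```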

Do Large Batches Always Improve Neural Network Throughput?


Common benchmarks like ResNet-50 generally have much higher throughput with large batch sizes than with batch size = 1. For example, the Nvidia Tesla T4 has 4x the throughput at batch = 32 compared to batch = 1. Of course, larger batch sizes have a tradeoff: latency increases, which may be undesirable in real-time applications. Why do larger batches increase throughput... » read more
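
A common reason is that model weights can be loaded once and reused across every image in the batch. Here is a hedged sketch of that amortization effect; the two time constants are made-up values purely for illustration.

```python
# Hedged sketch of why batching helps when weights must be fetched from DRAM:
# the weight-loading cost is paid once per batch and amortized over the images
# in it, while per-batch latency grows. Time constants below are illustrative.

def batch_throughput(batch: int, weight_load_s: float, compute_per_image_s: float):
    """Return (images/sec, latency in seconds) for a simple amortization model."""
    batch_time = weight_load_s + batch * compute_per_image_s
    return batch / batch_time, batch_time

for b in (1, 8, 32):
    ips, latency = batch_throughput(b, weight_load_s=3e-3, compute_per_image_s=1e-3)
    print(f"batch={b:2d}  throughput={ips:6.0f} img/s  latency={latency*1e3:5.1f} ms")
```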

Neural Network Performance Modeling Software


nnMAX Inference IP is nearing design completion. The nnMAX 1K tile will be available this summer for design integration in SoCs, and it can be arrayed to provide whatever inference throughput is desired. The InferX X1 chip will tape out late Q3 this year using 2x2 nnMAX tiles, for 4K MACs, with 8MB SRAM. The nnMAX Compiler is in development in parallel, and the first release is available now... » read more
