Neural Network Performance Modeling Software

By Geoff Tate - 08 May, 2019 - Comments: 0

nnMAX Inference IP is nearing design completion. The nnMAX 1K tile will be available this summer for design integration in SoCs, and it can be arrayed to provide whatever inference throughput is desired. The InferX X1 chip will tape out late Q3 this year using 2x2 nnMAX tiles, for 4K MACs, with 8MB SRAM. The nnMAX Compiler is in development in parallel, and the first release is available now... » read more

Multi-Layer Processing Boosts Inference Throughput/Watt

By Geoff Tate - 15 Apr, 2019 - Comments: 0

Discussion of inference throughput often focuses on the computations required. For example, YOLOv3, a powerful real-time object detection and recognition model, requires 227 billion MACs (multiply-accumulates) to process a single 2-megapixel image. This is with the Winograd transformation; it’s more than 300 billion without it. And there is a lot of discussion of the large size ... » read more
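To put the MAC count in the teaser above into perspective, here is a minimal sketch that turns MACs per image into an ideal frame rate. Only the 227 billion figure comes from the text; the MAC-unit count, the clock frequency, the assumption of one MAC per unit per cycle, and the function name are illustrative assumptions, not specifications of any product.

```python
def max_frames_per_second(macs_per_image, mac_units, clock_hz, utilization=1.0):
    """Upper bound on images/second, assuming each MAC unit retires one MAC per cycle."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    macs_per_second = mac_units * clock_hz * utilization
    return macs_per_second / macs_per_image


# 227e9 MACs per image is the figure quoted in the teaser.
# 4096 MAC units at 1 GHz is a hypothetical accelerator chosen for illustration.
ideal_fps = max_frames_per_second(227e9, mac_units=4096, clock_hz=1e9)
half_busy_fps = max_frames_per_second(227e9, mac_units=4096, clock_hz=1e9, utilization=0.5)
print(f"ideal: {ideal_fps:.1f} frames/s, at 50% utilization: {half_busy_fps:.1f} frames/s")
```

Under these assumptions the bound is roughly 18 frames per second, and it scales linearly with utilization, which is why keeping the MACs busy matters as much as how many there are.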

Inference Acceleration: Follow The Memory

By Geoff Tate - 07 Mar, 2019 - Comments: 0

Much has been written about the computational complexity of inference acceleration: very large matrix multiplies for fully-connected layers and huge numbers of 3x3 convolutions across megapixel images, both of which require many thousands of MACs (multiplier-accumulators) to achieve high throughput for models like ResNet-50 and YOLOv3. The other side of the coin is managing the movement of d... » read more

Use Inference Benchmarks Similar To Your Application

By Geoff Tate - 07 Feb, 2019 - Comments: 0

If an Inference IP supplier or Inference Accelerator Chip supplier offers a benchmark, it is probably ResNet-50. As a result, it might seem logical to use ResNet-50 to compare inference offerings. If you plan to use ResNet-50, it would be; but if your target application model is significantly different from ResNet-50, it could lead you to pick an inference offering that is not best for you. ... » read more

Lies, Damn Lies, And TOPS/Watt

By Geoff Tate - 07 Jan, 2019 - Comments: 1

There are almost a dozen vendors promoting inferencing IP, but none of them gives even a ResNet-50 benchmark. The only information they state typically is TOPS (Tera-Operations/Second) and TOPS/Watt. These two indicators of performance and power efficiency are almost useless by themselves. So what, exactly, does X TOPS really tell you about performance for your application? When a vendor ... » read more
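A short sketch of why a TOPS figure alone says little: peak TOPS is usually derived from the MAC count and clock rate, counting each MAC as two operations, while delivered throughput also depends on how busy the MACs are for a given model. The two designs and their utilization values below are hypothetical numbers chosen for illustration, not vendor data.

```python
def peak_tops(mac_units, clock_hz):
    """Peak tera-operations/second, counting one MAC as two operations (multiply + add)."""
    return 2 * mac_units * clock_hz / 1e12


def effective_tops(mac_units, clock_hz, utilization):
    """Peak TOPS scaled by the fraction of cycles the MACs do useful work."""
    return peak_tops(mac_units, clock_hz) * utilization


# Two hypothetical designs with identical peak TOPS but different
# (assumed) MAC utilization on the same model.
design_a = effective_tops(mac_units=4096, clock_hz=1e9, utilization=0.25)
design_b = effective_tops(mac_units=4096, clock_hz=1e9, utilization=0.75)
print(f"peak: {peak_tops(4096, 1e9):.3f} TOPS")
print(f"design A delivers {design_a:.3f} TOPS, design B delivers {design_b:.3f} TOPS")
```

Both designs would advertise the same headline number, yet under these assumed utilizations one delivers three times the useful work of the other.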

High Neural Inferencing Throughput At Batch=1

By Geoff Tate - 06 Dec, 2018 - Comments: 0

Microsoft presented the following slide as part of their Brainwave presentation at Hot Chips this summer: In existing inferencing solutions, high throughput (and high % utilization of the hardware) is possible for large batch sizes: this means that instead of processing say one image at a time, the inferencing engine processes say 10 or 50 images in parallel. This minimizes the number of... » read more

Real-Time Object Recognition At Low Cost/Power/Latency

By Geoff Tate - 01 Nov, 2018 - Comments: 0

Most neural network chips and IP talk about ResNet-50 benchmarks (image classification at 224x224 pixels). But we find that the number one neural network of interest for most customers is real-time object recognition, such as YOLOv3. It's not possible to do comparisons here because nobody shows a YOLOv3 benchmark for their inferencing. But it's very possible to improve on the inferencing per... » read more

Reconfigurable eFPGA For Aerospace Applications

By Geoff Tate - 12 Oct, 2018 - Comments: 0

Market research reports indicate about 10% of all dollar revenue of FPGA chips is for use in aerospace applications, and DARPA/DoD reports indicate about one-third of all dollar volume of ICs purchased by U.S. aerospace are FPGAs. FPGAs clearly are very important for aerospace applications because of a combination of short development time and the long mission life of many aerospace applica... » read more

Flexible, Energy-Efficient Neural Network Processing At 16nm

By Geoff Tate - 06 Sep, 2018 - Comments: 0

At Hot Chips 30, held in August in Silicon Valley, Harvard University (Paul Whatmough, SK Lee, S Xi, U Gupta, L Pentecost, M Donato, HC Hseuh, Professor Brooks and Professor Gu) made a presentation on “SMIV: A 16nm SoC with Efficient and Flexible DNN Acceleration for Intelligent IOT Devices.” (Their complete presentation is available now on the Hot Chips website for attendees and will be p... » read more

Sandia Labs’ New Configurable SoC

By Geoff Tate - 02 Aug, 2018 - Comments: 0

At DAC 2018, held in June in San Francisco, Sandia Labs made a public presentation for the first time describing its first SoC using eFPGA, called Dragonfly. This is the first public disclosure by any organization describing its requirements, architecture and use cases for the new technology option of embedded FPGA. John Teifel led the project for Sandia National Laboratories. Sandia has ... » read more
