Author's Latest Posts


One More Time: TOPS Do Not Predict Inference Throughput


You’ll often hear vendors talk about how many TOPS their chip has, implying that more TOPS means better inference performance. If you use TOPS to pick your AI inference chip, you will likely not be happy with what you get. Recently, Vivienne Sze, a professor at MIT, gave an excellent talk entitled “How to Evaluate Efficient Deep Neural Network Approaches.” Slides are also av...

Apples, Oranges & The Optimal AI Inference Accelerator


There is a wide range of AI inference accelerators available and a wide range of applications for them. No AI inference accelerator will be optimal for every application. For example, a data center class accelerator almost certainly will be too big, burn too much power, and cost too much for most edge applications. And an accelerator optimal for keyword recognition won’t have the capabil...

Integrating FPGA: Comparison Of Chiplets Vs. eFPGA


FPGA is widely popular in systems for its flexibility and adaptability. Increasingly, it is being used in high-volume applications. As volumes grow, system designers can consider integrating the FPGA into an SoC to reduce cost, reduce power and/or improve performance. There are two options for integrating FPGA into an SoC: FPGA chiplets, which replace the power-hungry SERDES/PHYs wit...

eFPGA As Fast And Dense As FPGA, On Any Process Node


A challenge for eFPGA when we started Flex Logix was that there are many customers and applications, and they all seemed to want eFPGA on different foundries, different nodes and different array sizes. And everyone wanted the eFPGA to be as fast and as dense as the FPGA leaders’ on the same node. Oh, and customers seem to wait until the last minute and then need the eFPGA ASAP. Xilinx and Altera (Intel ...

Increasing eFPGA Adoption Will Shape eFPGA Features/Benefits


eFPGA adoption is accelerating. eFPGA is now available from multiple suppliers for multiple foundries and on nodes including 180nm, 40nm, 28nm, 22nm, 16nm, 12nm and 7nm. Double-digit numbers of chips have been proven in silicon by multiple customers for multiple applications, and many more are in fab, in design and in planning. The three main applications are: Integration of existing FPGA chips int...

AI Inference: Pools Vs. Streams


Deep Learning and AI Inference originated in the data center and were first deployed in practical, volume applications in the data center. Only recently has Inference begun to spread to Edge applications (anywhere outside of the data center). In the data center, much of the data to be processed is a “pool” of data. For example, when you see your photo album tagged with all of the pictures ...

Software Is At Least As Important As Hardware For Inference Accelerators


In articles and conference presentations on Inference Accelerators, the focus is primarily on TOPS (frequency times number of MACs), a little bit on memory (DRAM interfaces and on-chip SRAM), very little on interconnect (also very important, but that’s another story) and almost nothing on the software! Without software, the inference accelerator is a rock that does nothing. Software is wha...
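The TOPS definition mentioned in the excerpt above (frequency times number of MACs) can be made concrete with a minimal sketch. This is an illustration only: the ×2 ops-per-MAC factor (counting one multiply plus one add per cycle) and the example clock and MAC-count figures are assumptions, not numbers from the article.

```python
def peak_tops(clock_hz: float, num_macs: int, ops_per_mac: int = 2) -> float:
    """Peak TOPS = clock frequency x MAC count x ops per MAC, scaled to tera-ops.

    ops_per_mac defaults to 2 because each MAC is conventionally counted as
    one multiply plus one add (an assumption, not from the article).
    """
    return clock_hz * num_macs * ops_per_mac / 1e12


# Hypothetical accelerator: 1 GHz clock and 4,096 MACs -> ~8.2 peak TOPS.
print(peak_tops(1e9, 4096))
```

The peak number is easy to compute; whether the software can keep those MACs busy on a real model is the hard part the excerpt is pointing at.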

Where Is The eFPGA Market And Ecosystem Headed?


In this article we’ll discuss the availability of eFPGA, the applications for eFPGA, and the current and future market size for eFPGA. eFPGA vendors & offerings: Embedded FPGA is a new development this decade. There are now multiple vendors offering eFPGA on a wide range of process nodes, with multiple customers. Menta has had eFPGA available for the longest: their offe...

Where Is The Edge AI Market And Ecosystem Headed?


Until recently, most AI was in datacenters and most was training. Things are changing quickly. Projections are that AI sales will grow rapidly to tens of billions of dollars by the mid-2020s, with most of the growth in Edge AI Inference. Edge inference applications: Where is the Edge Inference market today? Let’s look at the markets from highest throughput to lowest. Edge Servers: Recently Nvidia annou...

Modeling AI Inference Performance


The metric in AI Inference that matters to customers is throughput/$ and/or throughput/watt for their model. One might assume throughput correlates with TOPS, but that would be wrong. Examine the table below: the Nvidia Tesla T4 gets 7.4 inferences/TOP, the Xavier AGX 15, and the InferX X1 34.5. And the InferX X1 does it with 1/10th to 1/20th of the DRAM bandwidth of the ...
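A minimal sketch of the normalization the excerpt uses, assuming "inferences/TOP" means measured model throughput divided by the chip's rated TOPS; the numbers in the example are hypothetical placeholders, not the measured figures behind the article's table.

```python
def inferences_per_tops(throughput_inf_per_sec: float, rated_tops: float) -> float:
    """Measured model throughput normalized by the accelerator's rated peak TOPS.

    Two chips with the same rated TOPS can score very differently here,
    which is why TOPS alone does not predict inference throughput.
    """
    return throughput_inf_per_sec / rated_tops


# Hypothetical placeholder values (not the article's measurements):
# an accelerator rated at 100 TOPS delivering 1,500 inferences/sec on some model.
print(inferences_per_tops(1500.0, 100.0))  # -> 15.0 inferences per TOPS
```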
