Defining And Improving AI Performance


Many companies are developing AI chips, both for training and for inference. Although getting the required functionality is important, many solutions will be judged by their performance characteristics. Performance can be measured in different ways, such as the number of inferences per second or per watt. These figures depend on many factors, not just the hardware architecture. The optim... » read more
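
As a rough illustration of those two metrics, here is a minimal Python sketch; the workload and power numbers are hypothetical and the helper names are my own, not from the article.

```python
# Minimal sketch of two common inference metrics; all numbers hypothetical.

def inferences_per_second(num_inferences: int, elapsed_s: float) -> float:
    """Throughput: inferences completed per second."""
    return num_inferences / elapsed_s

def inferences_per_watt(throughput_ips: float, avg_power_w: float) -> float:
    """Efficiency: throughput normalized by average power draw."""
    return throughput_ips / avg_power_w

# Example: 10,000 inferences in 12.5 s at an average of 40 W.
tput = inferences_per_second(10_000, 12.5)  # 800 inferences/s
eff = inferences_per_watt(tput, 40.0)       # 20 inferences/s per watt
print(f"{tput:.0f} inf/s, {eff:.1f} inf/s/W")
```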

MLPerf Benchmarks


Geoff Tate, CEO of Flex Logix, talks about the new MLPerf benchmark, what’s missing from the benchmark, and which ones are relevant to edge inferencing. » read more

Where Is The Edge AI Market And Ecosystem Headed?


Until recently, most AI was in datacenters and most of it was training. Things are changing quickly. Projections are that AI sales will grow rapidly to tens of billions of dollars by the mid-2020s, with most of the growth in edge AI inference. Edge inference applications: Where is the edge inference market today? Let's look at the markets from highest throughput to lowest. Edge servers: Recently, Nvidia annou... » read more

Using FPGAs For AI


Artificial intelligence (AI) and machine learning (ML) are progressing at a rate that is outstripping Moore's Law. In fact, they are now evolving faster than silicon can be designed. The industry is looking at all possibilities to provide devices that have the necessary accuracy and performance, as well as a power budget that can be sustained. FPGAs are promising, but they also have some sig... » read more

Modeling AI Inference Performance


The metrics in AI inference that matter to customers are throughput/$ and/or throughput/watt for their model. One might assume throughput correlates with TOPS, but it doesn't. Examine the table below: the Nvidia Tesla T4 gets 7.4 inferences/TOPS, the Xavier AGX 15, and the InferX X1 34.5. And the InferX X1 does it with 1/10th to 1/20th of the DRAM bandwidth of the ... » read more
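
To make the metric concrete, here is a hedged sketch of the arithmetic behind inferences/TOPS. The peak INT8 TOPS ratings below are the commonly published vendor figures, assumed here rather than taken from the article; the implied throughputs simply follow from the ratios quoted above.

```python
# Sketch of the arithmetic behind inferences/TOPS: measured throughput
# (inferences/s) divided by the chip's peak TOPS rating. The peak INT8
# ratings below are commonly published vendor figures, assumed here,
# not taken from the article.

PEAK_INT8_TOPS = {"Tesla T4": 130.0, "Xavier AGX": 32.0, "InferX X1": 8.5}
INFERENCES_PER_TOPS = {"Tesla T4": 7.4, "Xavier AGX": 15.0, "InferX X1": 34.5}

for chip, eff in INFERENCES_PER_TOPS.items():
    # Implied absolute throughput for the benchmark model in question.
    throughput = eff * PEAK_INT8_TOPS[chip]
    print(f"{chip:>10}: {eff:4.1f} inf/TOPS -> ~{throughput:,.0f} inf/s")
```

The point of normalizing by TOPS is that it exposes how much of a chip's nominal compute is actually usable for a real model, which peak TOPS alone hides.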

Revving Up For Edge Computing


The edge is beginning to take shape as a way of limiting the amount of data that needs to be pushed up to the cloud for processing, setting the stage for a massive shift in compute architectures and a race among chipmakers for a stake in a new and highly lucrative market. So far, it's not clear which architectures will win, or how and where data will be partitioned between what needs to be p... » read more

eFPGA as Silicon Debugger


A variety of debugging features are available by implementing the Flex Logix embedded FPGA cores. These include:

• High observability—the many eFPGA IOs allow for thousands of signals to be monitored
• Event detection—the eFPGA logic can be configured to detect very complex signal patterns to help identify meaningful events (see the sketch below)
• High visibility—data can be logged by the eLA inside t... » read more
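
As a loose software analogy of the event-detection bullet above, the following hypothetical Python sketch scans a logged signal trace for a compound trigger condition. The signal names and the condition itself are invented for illustration and are not from Flex Logix's eFPGA tooling.

```python
# Hypothetical software analogy of hardware event detection: scan a stream
# of sampled signals for a compound trigger condition. Signal names and the
# condition are invented for illustration only.

from typing import Iterator

def detect_events(samples: list[dict], window: int = 4) -> Iterator[int]:
    """Yield sample indices where 'req' rises while 'busy' is high and no
    'ack' follows within `window` samples (a likely protocol bug)."""
    for i in range(1, len(samples)):
        rising_req = samples[i]["req"] and not samples[i - 1]["req"]
        if rising_req and samples[i]["busy"]:
            if not any(s["ack"] for s in samples[i : i + window]):
                yield i

trace = [{"req": 0, "busy": 1, "ack": 0},
         {"req": 1, "busy": 1, "ack": 0},
         {"req": 1, "busy": 1, "ack": 0},
         {"req": 0, "busy": 0, "ack": 0}]
print(list(detect_events(trace)))  # -> [1]
```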

Using Multiple Inferencing Chips In Neural Networks


Geoff Tate, CEO of Flex Logix, talks about what happens when you add multiple chips in a neural network, what a neural network model looks like, and what happens when it’s designed correctly vs. incorrectly. » read more

Advantages Of BFloat16 For AI Inference


Essentially all AI training is done in 32-bit floating point. But doing AI inference in 32-bit floating point is expensive, power-hungry and slow. And quantizing models to 8-bit integer, which is very fast and uses the least power, is a major investment of money, scarce resources and time. Now BFloat16 (BF16) offers an attractive balance for many users. BFloat16 offers essentially t... » read more
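
To show why BF16 conversion is so cheap, here is a minimal Python sketch of the format relationship: a BF16 value is simply the top 16 bits of the corresponding FP32 encoding (same sign bit and 8-bit exponent, with the mantissa shortened). Truncation is used below for simplicity; real converters typically round to nearest even.

```python
# BF16 keeps FP32's sign bit and 8-bit exponent (so no change in dynamic
# range) and drops the low 16 mantissa bits. Conversion here is truncation;
# hardware usually rounds to nearest even instead.

import struct

def fp32_to_bf16_bits(x: float) -> int:
    """Return the 16-bit BF16 pattern: the top 16 bits of the FP32 encoding."""
    fp32_bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return fp32_bits >> 16

def bf16_bits_to_fp32(bits: int) -> float:
    """Widen BF16 back to FP32 by appending 16 zero mantissa bits."""
    return struct.unpack(">f", struct.pack(">I", bits << 16))[0]

x = 3.14159
b = fp32_to_bf16_bits(x)
print(f"{x} -> 0x{b:04x} -> {bf16_bits_to_fp32(b)}")  # ~3.140625
```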

Week In Review: Design, Low Power


eSilicon debuted its 7nm high-bandwidth interconnect (HBI+) PHY IP, a special-purpose hard IP block that offers a high-bandwidth, low-power, low-latency, wide-parallel, clock-forwarded PHY interface for 2.5D applications such as chiplets. The HBI+ PHY delivers a data rate of up to 4.0Gbps per pin. Flexible configurations include up to 80 receive and 80 transmit connections per channel and up to 2... » read more
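
A quick back-of-the-envelope check on those figures, assuming the quoted 4.0Gbps per pin across all 80 pins in one direction of a channel (the aggregate math is mine, not from the announcement):

```python
# Per-channel bandwidth implied by the announcement's per-pin figures.
GBPS_PER_PIN = 4.0        # quoted data rate per pin
PINS_PER_DIRECTION = 80   # quoted receive (or transmit) connections/channel

per_direction_gbps = GBPS_PER_PIN * PINS_PER_DIRECTION  # 320 Gb/s
per_direction_gbytes_s = per_direction_gbps / 8         # 40 GB/s
print(f"{per_direction_gbps:.0f} Gb/s per channel per direction "
      f"(~{per_direction_gbytes_s:.0f} GB/s)")
```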
