Inferencing Efficiency


Geoff Tate, CEO of Flex Logix, talks with Semiconductor Engineering about how to measure efficiency in inferencing chips, how to achieve the most throughput for the lowest cost, and what the benchmarks really show. » read more

Edge Complexity To Grow For 5G


Edge computing is becoming as critical to the success of 5G as millimeter-wave technology will be to the success of the edge. In fact, it increasingly looks as if neither will succeed without the other. 5G networks won’t be able to meet 3GPP’s 4-millisecond-latency rule without some layer to deliver the data, run the applications and broker the complexities of multi-tier Internet apps ac... » read more

TOPS, Memory, Throughput And Inference Efficiency


Dozens of companies have developed or are developing IP and chips for neural network inference. Almost every AI company gives TOPS but little other information. What is TOPS? It stands for trillions, or tera, operations per second. It is primarily a measure of maximum achievable throughput, not a measure of actual throughput. Most operations are MACs (multiply/accumulates), so TOPS = (number of MAC... » read more
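
The excerpt breaks off mid-formula; the conventional completion is peak TOPS = (number of MAC units) × 2 operations per MAC × clock frequency. A minimal sketch of that arithmetic follows, with the MAC count and clock rate chosen purely for illustration rather than taken from the article:

```python
# Sketch of the conventional peak-TOPS calculation for a MAC array.
# The formula completion and the example numbers are assumptions for
# illustration; they are not figures from the article.

def peak_tops(num_macs: int, freq_ghz: float) -> float:
    """Peak TOPS, counting each MAC as two operations (multiply + accumulate)."""
    ops_per_cycle = num_macs * 2                    # one multiply and one add per MAC
    ops_per_second = ops_per_cycle * freq_ghz * 1e9 # cycles per second at freq_ghz
    return ops_per_second / 1e12                    # tera-operations per second

# Illustrative only: 4,096 MACs at 1 GHz -> ~8.2 peak TOPS.
print(peak_tops(num_macs=4096, freq_ghz=1.0))
```

As the excerpt notes, this is a ceiling: actual throughput depends on how well a given model keeps those MACs busy.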

Data Confusion At The Edge


Disparities in pre-processing of data at the edge, coupled with a total lack of standardization, are raising questions about how that data will be prioritized and managed in AI and machine learning systems. Initially, the idea was that 5G would connect edge data to the cloud, where massive server farms would infer patterns from that data and send it back to the edge devices. But there is far... » read more

Do Large Batches Always Improve Neural Network Throughput?


Common benchmarks like ResNet-50 generally show much higher throughput with large batch sizes than with batch size = 1. For example, the Nvidia Tesla T4 has 4x the throughput at batch=32 compared with batch=1 mode. Of course, larger batch sizes come with a tradeoff: latency increases, which may be undesirable in real-time applications. Why do larger batches increase throughput... » read more
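
As a rough illustration of that tradeoff, the sketch below uses invented per-batch times (not measured Tesla T4 figures) to show how batching multiplies throughput while every image in the batch inherits the full batch latency:

```python
# Minimal sketch of the batching throughput/latency tradeoff described above.
# The per-batch times are hypothetical, chosen only to mirror the "roughly 4x
# throughput at batch=32" shape of the excerpt.

def stats(batch_size: int, seconds_per_batch: float):
    throughput = batch_size / seconds_per_batch  # images per second
    latency_ms = seconds_per_batch * 1e3         # each image waits for the whole batch
    return throughput, latency_ms

# Hypothetical accelerator: a batch of 32 keeps more MACs busy, so it takes
# far less than 32x the time of a single image.
print(stats(batch_size=1,  seconds_per_batch=0.001))   # ~1,000 img/s, 1 ms latency
print(stats(batch_size=32, seconds_per_batch=0.008))   # ~4,000 img/s, 8 ms latency
```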

Accelerating Endpoint Inferencing


Chipmakers are getting ready to debut inference chips for endpoint devices, even though the rest of the machine-learning ecosystem has yet to be established. Whatever infrastructure does exist today is mostly in the cloud, on edge-computing gateways, or in company-specific data centers, which most companies continue to use. For example, Tesla has its own data center. So do most major carmake... » read more

Inferencing At The Edge


Geoff Tate, CEO of Flex Logix, talks about the challenges of power and performance at the edge, why this market is so important from a business and technology standpoint, and what factors need to be balanced. » read more

How To Integrate An Embedded FPGA


Choosing to add programmable logic into an SoC with an eFPGA is just the beginning. Other choices follow: how many lookup tables (LUTs), how much routing and what topology, how data will be transferred in and out of the fabric, whether that data needs to be coherent with system memory, how the fabric will be programmed and tested, and what RTL functions need to be embedded into the programmable fabric ... » read more

Is ADAS The Edge?


Debate is brewing over whether ADAS applications fall on the edge, or whether they are better viewed squarely within the automotive camp. There is more to this discussion than just semantics. The edge represents a huge greenfield opportunity for electronics of all sorts, and companies from the mobile market and from the cloud are both rushing to stake their claim. At this point the... » read more

Neural Network Performance Modeling Software


nnMAX Inference IP is nearing design completion. The nnMAX 1K tile will be available this summer for design integration in SoCs, and it can be arrayed to provide whatever inference throughput is desired. The InferX X1 chip will tape out late Q3 this year using a 2x2 array of nnMAX tiles, for 4K MACs, with 8MB of SRAM. The nnMAX Compiler is being developed in parallel, and the first release is available now... » read more
