Challenges In Developing A New Inferencing Chip


Cheng Wang, co-founder and senior vice president of software and engineering at Flex Logix, sat down with Semiconductor Engineering to explain the process of bringing an inferencing accelerator chip to market, from bring-up, programming and partitioning to tradeoffs involving speed and customization. SE: Edge inferencing chips are just starting to come to market. What challenges di... » read more

Architectural Considerations For AI


Custom chips, labeled as artificial intelligence (AI) or machine learning (ML), are appearing on a weekly basis, each claiming to be 10X faster than existing devices or to consume 1/10 the power. Whether that is enough to dethrone existing architectures, such as GPUs and FPGAs, or whether they will survive alongside those architectures isn't clear yet. The problem, or the opportunity, is that t... » read more

Why Reconfigurability Is Essential For AI Edge Inference Throughput


For a neural network to run at its fastest, the underlying hardware must run efficiently on all layers. Through the inference of any CNN—whether it be based on an architecture such as YOLO, ResNet, or Inception—the workload regularly shifts from being bottlenecked by memory to being bottlenecked by compute resources. You can think of each convolutional layer as its own mini-workload, and so... » read more

Applications, Challenges For Using AI In Fabs


Experts at the Table: Semiconductor Engineering sat down to discuss chip scaling, transistors, new architectures, and packaging with Jerry Chen, head of global business development for manufacturing & industrials at Nvidia; David Fried, vice president of computational products at Lam Research; Mark Shirey, vice president of marketing and applications at KLA; and Aki Fujimura, CEO of D2S. Wh... » read more

Maximizing Edge AI Performance


Inference of convolutional neural network models is algorithmically straightforward, but to get the fastest performance for your application there are a few pitfalls to keep in mind when deploying. A number of factors make efficient inference difficult, which we will first step through before diving into specific solutions to address and resolve each. By the end of this article, you will be arm... » read more

The Best AI Edge Inference Benchmark


When evaluating the performance of an AI accelerator, there’s a range of methodologies available to you. In this article, we’ll discuss some of the different ways to structure your benchmark research before moving forward with an evaluation that directly runs your own model. Just like when buying a car, research will only get you so far before you need to get behind the wheel and give your ... » read more

Tapping Into Purpose-Built Neural Network Models For Even Bigger Efficiency Gains


Neural networks can be categorized as a set of algorithms modelled loosely after the human brain that can ‘learn’ by incorporating new data. Indeed, many benefits can be derived from developing purpose-built “computationally efficient” neural network models. However, to ensure your model is effective, there are several key requirements that need to be considered. One critical conside... » read more

Edge Inference Applications And Market Segmentation


Until recently, most AI ran in the data center/cloud, and most of that was training. Things are changing quickly. Projections are that AI sales will grow rapidly to tens of billions of dollars by the mid-2020s, with most of the growth in edge AI inference. Data center/cloud vs. edge inference: What’s the difference? The data center/cloud is where inference started on Xeons. To gain efficiency, much ... » read more

Convolutional Neural Network With INT4 Optimization


Xilinx provides an INT8 AI inference accelerator on Xilinx hardware platforms — Deep Learning Processor Unit (XDPU). However, in some resource-limited, high-performance and low-latency scenarios (such as the resource-power-sensitive edge side and low-latency ADAS scenario), low bit quantization of neural networks is required to achieve lower power consumption and higher performance than provi... » read more

ResNet-50 Does Not Predict Inference Throughput For MegaPixel Neural Network Models


Customers are considering applications for AI inference and want to evaluate multiple inference accelerators. As we discussed last month, TOPS do NOT correlate with inference throughput and you should use real neural network models to benchmark accelerators. So is ResNet-50 a good benchmark for evaluating relative performance of inference accelerators? If your application is going to p... » read more
