FP8: Cross-Industry Hardware Specification For AI Training And Inference (Arm, Intel, Nvidia)


Arm, Intel, and Nvidia proposed a specification for an 8-bit floating point (FP8) format that could provide a common interchangeable format that works for both AI training and inference and allow AI models to operate and perform consistently across hardware platforms. Find the technical paper titled " FP8 Formats For Deep Learning" here. Published Sept 2022. Abstract: "FP8 is a natural p... » read more

There’s More To Machine Learning Than CNNs


Neural networks – and convolutional neural networks (CNNs) in particular – have received an abundance of attention over the last few years, but they're not the only useful machine-learning structures. There are numerous other ways for machines to learn how to solve problems, and there is room for alternative machine-learning structures. “Neural networks can do all this really comple... » read more

Making Sense Of New Edge-Inference Architectures


New edge-inference machine-learning architectures have been arriving at an astounding rate over the last year. Making sense of them all is a challenge. To begin with, not all ML architectures are alike. One of the complicating factors in understanding the different machine-learning architectures is the nomenclature used to describe them. You’ll see terms like “sea-of-MACs,” “systolic... » read more

Edge-Inference Architectures Proliferate


First part of two parts. The second part will dive into basic architectural characteristics. The last year has seen a vast array of announcements of new machine-learning (ML) architectures for edge inference. Unburdened by the need to support training, but tasked with low latency, the devices exhibit extremely varied approaches to ML inference. “Architecture is changing both in the comp... » read more

The MCU Dilemma


The humble microcontroller is getting squeezed on all sides. While most of the semiconductor industry has been able to take advantage of Moore's Law, the MCU market has faltered because flash memory does not scale beyond 40nm. At the same time, new capabilities such as voice activation and richer sensor networks are requiring inference engines to be integrated for some markets. In others, re... » read more

IP Management And Development At 5/3nm


The growing complexity of moving to new process nodes is making it much more difficult to create, manage and re-use IP. There are more rules, more data to manage, and more potential interactions as density increases, both in planar implementations and in advanced packaging. And the problems only get worse as designs move to 5nm and 3nm, and as more heterogeneous components such as accelerato... » read more

Edge Inferencing Challenges


Geoff Tate, CEO of Flex Logix, talks about balancing different variables to improve performance and reduce power at the lowest cost possible in order to do inferencing in edge devices. https://youtu.be/1BTxwew--5U » read more

Making AI Run Faster


The semiconductor industry has woken up to the fact that heterogeneous computing is the way forward and that inferencing will require more than a GPU or a CPU. The numbers being bandied about by the 30 or so companies working on this problem are 100X improvements in performance. But how to get there isn't so simple. It requires four major changes, as well as some other architectural shifts. ... » read more

Speeding Up Neural Networks


Neural networking is gaining traction as the best way of collecting and moving critical data from the physical world and processing it in the digital world. Now the question is how to speed up this whole process. But it isn't a straightforward engineering challenge. Neural networking itself is in a state of almost constant flux and development, which makes it something of a moving target. Th... » read more