ISA and Microarchitecture Extensions Over Dense Matrix Engines to Support Flexible Structured Sparsity for CPUs (Georgia Tech, Intel Labs)


A technical paper titled "VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs" was published (preprint) by researchers at Georgia Tech and Intel Labs. Abstract: "Deep Learning (DL) acceleration support in CPUs has recently gained a lot of traction, with several companies (Arm, Intel, IBM) announcing products with specialized matrix engines accessible v... » read more

Review of Tools & Techniques for DL Edge Inference


A new technical paper titled "Efficient Acceleration of Deep Learning Inference on Resource-Constrained Edge Devices: A Review" was published in "Proceedings of the IEEE" by researchers at University of Missouri and Texas Tech University. Abstract: Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted in breakthroughs in many areas. However, deploying thes... » read more

DeepGBASS: Deep Guided Boundary-Aware Semantic Segmentation


Image semantic segmentation is ubiquitously used in scene understanding applications, such as AI Camera, which require high accuracy and efficiency. Deep learning has significantly advanced the state of the art in semantic segmentation. However, many recent semantic segmentation works consider only class accuracy and ignore the accuracy at the boundaries between semantic classes. To improv... » read more

Will Floating Point 8 Solve AI/ML Overhead?


While the media buzzes about the Turing Test-busting results of ChatGPT, engineers are focused on the hardware challenges of running large language models and other deep learning networks. High on the ML punch list is how to run models more efficiently using less power, especially in critical applications like self-driving vehicles where latency becomes a matter of life or death. AI already ... » read more

Using Silicon Photonics To Reduce Latency On Edge Devices


A new technical paper titled "Delocalized photonic deep learning on the internet’s edge" was published by researchers at MIT and Nokia Corporation. “Every time you want to run a neural network, you have to run the program, and how fast you can run the program depends on how fast you can pipe the program in from memory. Our pipe is massive — it corresponds to sending a full feature-leng... » read more

FP8: Cross-Industry Hardware Specification For AI Training And Inference (Arm, Intel, Nvidia)


Arm, Intel, and Nvidia proposed a specification for an 8-bit floating-point (FP8) format that could provide a common interchangeable format that works for both AI training and inference and allow AI models to operate and perform consistently across hardware platforms. Find the technical paper titled "FP8 Formats For Deep Learning" here. Published Sept 2022. Abstract: "FP8 is a natural p... » read more

Convolutional Neural Networks: Co-Design of Hardware Architecture and Compression Algorithm


Researchers at Soongsil University (Korea) published "A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration." Abstract: "Over the past decade, deep-learning-based representations have demonstrated remarkable performance in academia and industry. The learning capability of convolutional neural networks (CNNs) originates from a combination of various feature extraction... » read more

Deep Learning To Classify And Establish Structure Property Predictions With PeakForce QNM Atomic Force Microscopy


Machine learning, and specifically deep learning, is a powerful tool to establish the presence (or absence) of microstructure correlations to bulk properties, with its ability to flesh out relationships and trends that are difficult to establish otherwise. This application note discusses the use of deep learning tools to explore AFM phase and PeakForce Quantitative Nanomechanics (QNM) im... » read more

DNN-Opt, A Novel Deep Neural Network (DNN) Based Black-Box Optimization Framework For Analog Sizing


This technical paper titled "DNN-Opt: An RL Inspired Optimization for Analog Circuit Sizing using Deep Neural Networks" was co-authored by researchers at The University of Texas at Austin, Intel, and the University of Glasgow. The paper was a best paper candidate at DAC 2021. "In this paper, we present DNN-Opt, a novel Deep Neural Network (DNN) based black-box optimization framework for analog sizi... » read more

New Uses For AI In Chips


Artificial intelligence is being deployed across a number of new applications, from improving performance and reducing power in a wide range of end devices to spotting irregularities in data movement for security reasons. While most people are familiar with using machine learning and deep learning to distinguish between cats and dogs, emerging applications show how this capability can be use... » read more
