DeepGBASS: Deep Guided Boundary-Aware Semantic Segmentation


Image semantic segmentation is ubiquitously used in scene understanding applications, such as AI Camera, which require high accuracy and efficiency. Deep learning has significantly advanced the state of the art in semantic segmentation. However, many recent semantic segmentation works only consider class accuracy and ignore the accuracies at the boundaries between semantic classes. To improv... » read more

Will Floating Point 8 Solve AI/ML Overhead?


While the media buzzes about the Turing Test-busting results of ChatGPT, engineers are focused on the hardware challenges of running large language models and other deep learning networks. High on the ML punch list is how to run models more efficiently using less power, especially in critical applications like self-driving vehicles where latency becomes a matter of life or death. AI already ... » read more

Using Silicon Photonics To Reduce Latency On Edge Devices


A new technical paper titled "Delocalized photonic deep learning on the internet’s edge" was published by researchers at MIT and Nokia Corporation. “Every time you want to run a neural network, you have to run the program, and how fast you can run the program depends on how fast you can pipe the program in from memory. Our pipe is massive — it corresponds to sending a full feature-leng... » read more

FP8: Cross-Industry Hardware Specification For AI Training And Inference (Arm, Intel, Nvidia)


Arm, Intel, and Nvidia proposed a specification for an 8-bit floating point (FP8) format that could provide a common interchangeable format that works for both AI training and inference and allow AI models to operate and perform consistently across hardware platforms. Find the technical paper titled "FP8 Formats For Deep Learning" here. Published Sept 2022. Abstract: "FP8 is a natural p... » read more

Convolutional Neural Networks: Co-Design of Hardware Architecture and Compression Algorithm


Researchers at Soongsil University (Korea) published "A Survey on Efficient Convolutional Neural Networks and Hardware Acceleration." Abstract: "Over the past decade, deep-learning-based representations have demonstrated remarkable performance in academia and industry. The learning capability of convolutional neural networks (CNNs) originates from a combination of various feature extraction... » read more

Deep Learning To Classify And Establish Structure Property Predictions With PeakForce QNM Atomic Force Microscopy


Machine learning, and specifically deep learning, is a powerful tool to establish the presence (or absence) of microstructure correlations to bulk properties with its ability to flesh out relationships and trends that are difficult to establish otherwise. This application note discusses the use of deep learning tools to explore AFM phase and PeakForce Quantitative Nanomechanics (QNM) im... » read more

DNN-Opt, A Novel Deep Neural Network (DNN) Based Black-Box Optimization Framework For Analog Sizing


This technical paper titled "DNN-Opt: An RL Inspired Optimization for Analog Circuit Sizing using Deep Neural Networks" was co-authored by researchers at The University of Texas at Austin, Intel, and the University of Glasgow. The paper was a best paper candidate at DAC 2021. "In this paper, we present DNN-Opt, a novel Deep Neural Network (DNN) based black-box optimization framework for analog sizi... » read more

New Uses For AI In Chips


Artificial intelligence is being deployed across a number of new applications, from improving performance and reducing power in a wide range of end devices to spotting irregularities in data movement for security reasons. While most people are familiar with using machine learning and deep learning to distinguish between cats and dogs, emerging applications show how this capability can be use... » read more

Low Power HW Accelerator for FP16 Matrix Multiplications For Tight Integration Within RISC-V Cores


This new technical paper titled "RedMulE: A Compact FP16 Matrix-Multiplication Accelerator for Adaptive Deep Learning on RISC-V-Based Ultra-Low-Power SoCs" was published by researchers at the University of Bologna and ETH Zurich. According to their abstract: "One of the key stumbling stones is the need for parallel floating-point operations, which are considered unaffordable on sub-100 mW extre... » read more

Analog Deep Learning Processor (MIT)


A team of researchers at MIT is working on hardware for artificial intelligence that offers faster computing with less power. The analog deep learning technique involves sending protons through solids at extremely fast speeds. “The working mechanism of the device is electrochemical insertion of the smallest ion, the proton, into an insulating oxide to modulate its electronic conductivity... » read more
