Techniques For Improving Energy Efficiency of Training/Inference for NLP Applications, Including Power Capping & Energy-Aware Scheduling


This new technical paper titled "Great Power, Great Responsibility: Recommendations for Reducing Energy for Training Language Models" is from researchers at MIT and Northeastern University. Abstract: "The energy requirements of current natural language processing models continue to grow at a rapid, unsustainable pace. Recent works highlighting this problem conclude there is an urgent need ... » read more

Can Analog Make A Comeback?


We live in an analog world dominated by digital processing, but that could change. Domain specificity, and the desire for greater levels of optimization, may provide analog compute with some significant advantages — and the possibility of a comeback. For the last four decades, the advantages of digital scaling and flexibility have pushed the dividing line between analog and digital closer ... » read more

Transforming AI Models For Accelerator Chips


AI is all about speeding up the movement and processing of data. Ali Cheraghi, solution architect at Flex Logix, talks about why floating point data needs to be converted into integer point data, how that impacts power and performance, and how different approaches in quantization play into this formula. » read more

Expedera: Custom Deep Learning Accelerators Through Soft-IP


Internet of Things (IoT) and Artificial Intelligence (AI) have caused a massive increase in data generation — and along with it, a need to process data faster and more efficiently. Dubbed a “tsunami of data,” data centers are expected to consume about one-fifth of worldwide energy before 2030. This data explosion is driving a wave of startups looking to gain a foothold in custom accele... » read more

How Inferencing Differs From Training in Machine Learning Applications


Machine learning (ML)-based approaches to system development employ a fundamentally different style of programming than historically used in computer science. This approach uses example data to train a model to enable the machine to learn how to perform a task. ML training is highly iterative with each new piece of training data generating trillions of operations. The iterative nature of the tr... » read more

Using ML In EDA


Machine learning is becoming essential for designing chips due to the growing volume of data stemming from increasing density and complexity. Nick Ni, director of product marketing for AI at Xilinx, examines why machine learning is gaining traction at advanced nodes, where it’s being used today and how it will be used in the future, how quality of results compare with and without ML, and what... » read more

How Dynamic Hardware Efficiently Solves The Neural Network Complexity Problem


Given the high computational requirements of neural network models, efficient execution is paramount. When performed trillions of times per second even the tiniest inefficiencies are multiplied into large inefficiencies at the chip and system level. Because AI models continue to expand in complexity and size as they are asked to become more human-like in their (artificial) intelligence, it is c... » read more

Safe And Robust Machine Learning


Deploying machine learning in the real world is a lot different than developing and testing it in a lab. Quenton Hall, AI systems architect at Xilinx, examines security implications on both the inferencing and training side, the potential for disruptions to accuracy, and how accessible these models and algorithms will be when they are used at the edge and in the cloud. This involves everything ... » read more

Shifting Toward Data-Driven Chip Architectures


An explosion in data is forcing chipmakers to rethink where to process data, which are the best types of processors and memories for different types of data, and how to structure, partition and prioritize the movement of raw and processed data. New chips from systems companies such as Google, Facebook, Alibaba, and IBM all incorporate this approach. So do those developed by vendors like Appl... » read more

Accelerating AI/ML Inferencing With GDDR6 DRAM


The origins of graphics double data rate (GDDR) memory can be traced to the rise of 3D gaming on PCs and consoles. The first graphics processing units (GPU) packed single data rate (SDR) and double data rate (DDR) DRAM – the same solution used for CPU main memory. As gaming evolved, the demand for higher frame rates at ever higher resolutions drove the need for a graphics-workload specific me... » read more

← Older posts Newer posts →