Lower-Energy, High-Performance LLM On FPGA Without Matrix Multiplication


A new technical paper titled "Scalable MatMul-free Language Modeling" was published by researchers at UC Santa Cruz, Soochow University, UC Davis, and LuxiTech. Abstract: "Matrix multiplication (MatMul) typically dominates the overall computational cost of large language models (LLMs). This cost only grows as LLMs scale to larger embedding dimensions and context lengths. In this work, we show that MatMul..."
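As a rough illustration of the scaling claim in the abstract (this sketch is ours, not from the paper, and the function name `matmul_flops` is hypothetical): a dense layer multiplying an (n × d_in) activation matrix by a (d_in × d_out) weight matrix costs about 2·n·d_in·d_out floating-point operations, so a square d × d projection grows quadratically with the embedding dimension.

```python
def matmul_flops(n_tokens: int, d_in: int, d_out: int) -> int:
    """Multiply-accumulate cost of a dense layer: (n x d_in) @ (d_in x d_out).
    Each output element needs d_in multiplies and d_in adds -> 2*d_in FLOPs."""
    return 2 * n_tokens * d_in * d_out

# Doubling the embedding dimension of a square d x d projection
# roughly quadruples its cost:
assert matmul_flops(1, 2048, 2048) * 4 == matmul_flops(1, 4096, 4096)
```

This is the quadratic growth the abstract alludes to, and it is why removing MatMul entirely is attractive at large embedding dimensions.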

Improving Image Resolution At The Edge


How much cameras can see depends on how accurately images are rendered and classified. The higher the resolution, the greater the accuracy. But higher resolution also requires significantly more computation, and it demands enough flexibility in the design to adapt to new algorithms and network models. Jeremy Roberson, technical director and software architect for AI/ML at Flex Logix, talks...

More Efficient Matrix-Multiplication Algorithms with Reinforcement Learning (DeepMind)


A new research paper titled "Discovering faster matrix multiplication algorithms with reinforcement learning" was published by researchers at DeepMind. "Here we report a deep reinforcement learning approach based on AlphaZero for discovering efficient and provably correct algorithms for the multiplication of arbitrary matrices," states the paper. Find the technical paper link here. Publis...
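For context, DeepMind's result extends a line of work going back to Strassen's 1969 algorithm, which multiplies two 2×2 matrices with 7 scalar multiplications instead of the naive 8. A minimal sketch (the function name is ours, not from the paper):

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices (nested lists) using Strassen's
    7-multiplication scheme instead of the naive 8."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    m1 = (a + d) * (e + h)
    m2 = (c + d) * e
    m3 = a * (f - h)
    m4 = d * (g - e)
    m5 = (a + b) * h
    m6 = (c - a) * (e + f)
    m7 = (b - d) * (g + h)
    return [[m1 + m4 - m5 + m7, m3 + m5],
            [m2 + m4, m1 - m2 + m3 + m6]]

assert strassen_2x2([[1, 2], [3, 4]], [[5, 6], [7, 8]]) == [[19, 22], [43, 50]]
```

Applied recursively to block matrices, saving one multiplication per 2×2 step lowers the asymptotic cost from O(n³) to roughly O(n^2.81); the paper's AlphaTensor system searches for schemes of this kind automatically.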

Is Programmable Overhead Worth The Cost?


Programmability has fueled the growth of most semiconductor products, but how much does it actually cost? And is that cost worth it? The answer is more complicated than a simple efficiency formula. It can vary by application, by the maturity of technology in a particular market, and by the context of much larger systems. What's considered important for one design may be very different for anothe...

Maximizing Edge AI Performance


Inference of convolutional neural network models is algorithmically straightforward, but getting the fastest performance for your application means avoiding a few pitfalls when deploying. A number of factors make efficient inference difficult; we will first step through them before diving into specific solutions for each. By the end of this article, you will be arm...

In-Memory Computing


Gideon Intrater, CTO at Adesto Technologies, talks about why in-memory computing is now being taken seriously again, years after it was first proposed as a possible option. What's changed is an explosion in data, and a recognition that it's too time- and energy-intensive to send all of that data back and forth between memories and processors on the same chip, let alone to the cloud and back. On...

Speeding Up AI


Robert Blake, president and CEO of Achronix, sat down with Semiconductor Engineering to talk about AI, which processors work best where, and different approaches to accelerate performance. SE: How is AI affecting the FPGA business, given the constant changes in algorithms and the proliferation of AI almost everywhere? Blake: As we talk to more and more customers deploying new products and...

Improving Edge Inferencing


Cheng Wang, senior vice president of engineering at Flex Logix, talks with Semiconductor Engineering about how to improve the efficiency and speed of edge inferencing chips, what causes bottlenecks, and why AI chips are different from other types of semiconductors.

The Automation Of AI


Semiconductor Engineering sat down to discuss the role that EDA has in automating artificial intelligence and machine learning with Doug Letcher, president and CEO of Metrics; Daniel Hansson, CEO of Verifyter; Harry Foster, chief scientist verification for Mentor, a Siemens Business; Larry Melling, product management director for Cadence; Manish Pandey, Synopsys fellow; and Raik Brinkmann, CEO ...

The DNA of an Artificial Intelligence SoC


Over the past decade, a few advancements have made artificial intelligence (AI) one of the most exciting technologies of our lifetime. In 2012, Geoffrey Everest Hinton demonstrated his generalized backpropagation neural network algorithm in the ImageNet challenge, which revolutionized the field of computer vision. However, the math was developed years prior to 2012, and it was the available m...
