Re-Architecting AI For Power


The industry is becoming increasingly concerned about the amount of power being consumed by AI, but there is no simple solution to the problem. It requires a deep understanding of the application, the software and hardware architectures at both the semiconductor and system levels, and how all of this is designed and implemented. Each piece plays a role in the total power consumed and the utilit... » read more

Transformers At The Edge: Efficient LLM Deployment


Since the groundbreaking 2017 publication of “Attention Is All You Need,” the transformer architecture has fundamentally reshaped artificial intelligence research and development. This innovation laid the foundation for Large Language Models (LLMs) and Video Language Models (VLMs), fueling a wave of productization across the industry. A defining milestone was the public launch of ChatGPT in... » read more

Implementing AI Activation Functions


Activation functions play a critical role in AI inference, helping to ferret out nonlinear behaviors in AI models. This makes them an integral part of any neural network, but nonlinear functions can be fussy to build in silicon. Is it better to have a CPU calculate them? Should hardware function units be laid down to execute them? Or would a lookup table (LUT) suffice? Most architectures inc... » read more

Normalization Keeps AI Numbers In Check


AI training and inference are all about running data through models — typically to make some kind of decision. But the paths that the calculations take aren’t always straightforward, and as a model processes its inputs, those calculations may go astray. Normalization is a process that can keep data in bounds, improving both training and inference. Foregoing normalization can result in at... » read more

To (B)atch Or Not To (B)atch?


When evaluating benchmark results for AI/ML processing solutions, it is very helpful to remember Shakespeare’s Hamlet, and the famous line: “To be, or not to be.” Except in this case the “B” stands for Batched. Batch size matters There are two different ways in which a machine learning inference workload can be used in a system. A particular ML graph can be used one time, preced... » read more

GDDR7 Memory Supercharges AI Inference


GDDR7 is the state-of-the-art graphics memory solution with a performance roadmap of up to 48 Gigatransfers per second (GT/s) and memory throughput of 192 GB/s per GDDR7 memory device. The next generation of GPUs and accelerators for AI inference will use GDDR7 memory to provide the memory bandwidth needed for these demanding workloads. AI is two applications: training and inference. With tr... » read more

GDDR7: The Ideal Memory Solution In AI Inference


The generative AI market is experiencing rapid growth, driven by the increasing parameter size of Large Language Models (LLMs). This growth is pushing the boundaries of performance requirements for training hardware within data centers. For an in-depth look at this, consider the insights provided in "HBM3E: All About Bandwidth". Once trained, these models are deployed across a diverse range of... » read more

Dedicated Approximate Computing Framework To Efficiently Compute PCs On Hardware


A technical paper titled “On Hardware-efficient Inference in Probabilistic Circuits” was published by researchers at Aalto University and UCLouvain. Abstract: "Probabilistic circuits (PCs) offer a promising avenue to perform embedded reasoning under uncertainty. They support efficient and exact computation of various probabilistic inference tasks by design. Hence, hardware-efficient compu... » read more

The Implications Of AI Everywhere: From Data Center To Edge


Generative AI has upped the ante on the transformative force of AI, driving profound implications across all aspects of our everyday lives. Over the past year, we have seen AI capabilities placed firmly in the hands of consumers. The recent news and product announcements emerging from MWC 2024 highlighted what we can expect to see from the next wave of generative AI applications. AI will be eve... » read more

Why A DSP Is Indispensable In The New World of AI


Chips being designed today for the automotive, mobile handset, AI-IoT (artificial intelligence - Internet of things), and other AI applications will be fabricated in a year or two, designed into end products that will hit the market in three or more years, and then have a product lifecycle of at least five years. These chips will be used in systems with a large number and various types of senso... » read more

← Older posts Newer posts →