To (B)atch Or Not To (B)atch?

By Steve Roddy - 18 Nov, 2024 - Comments: 0

When evaluating benchmark results for AI/ML processing solutions, it is very helpful to remember Shakespeare’s Hamlet, and the famous line: “To be, or not to be.” Except in this case the “B” stands for Batched. Batch size matters. There are two different ways in which a machine learning inference workload can be used in a system. A particular ML graph can be used one time, preced... » read more

GDDR7 Memory Supercharges AI Inference

By Tim Messegee - 17 Oct, 2024 - Comments: 0

GDDR7 is the state-of-the-art graphics memory solution with a performance roadmap of up to 48 gigatransfers per second (GT/s) and memory throughput of 192 GB/s per GDDR7 memory device. The next generation of GPUs and accelerators for AI inference will use GDDR7 memory to provide the memory bandwidth needed for these demanding workloads. AI comprises two applications: training and inference. With tr... » read more
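
The two figures quoted in this excerpt are consistent with each other if each GDDR7 device exposes a 32-bit data interface. That width is an assumption here, since the excerpt does not state it. The sketch below shows the arithmetic; the function name and its default argument are illustrative only.

```python
def peak_bandwidth_gb_per_s(data_rate_gt_per_s: float, bus_width_bits: int = 32) -> float:
    """Peak per-device bandwidth in GB/s.

    Each transfer moves one bit per data pin, so
    bandwidth = transfers per second * bits per transfer / 8 bits per byte.
    The 32-bit default is an assumed device width, not a figure from the excerpt.
    """
    return data_rate_gt_per_s * bus_width_bits / 8


# 48 GT/s across an assumed 32-bit interface
print(peak_bandwidth_gb_per_s(48))  # 192.0
```

This is a theoretical peak; sustained throughput in a real system would depend on access patterns and controller efficiency.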

GDDR7: The Ideal Memory Solution In AI Inference

By Frank Ferro - 29 Aug, 2024 - Comments: 1

The generative AI market is experiencing rapid growth, driven by the increasing parameter size of Large Language Models (LLMs). This growth is pushing the boundaries of performance requirements for training hardware within data centers. For an in-depth look at this, consider the insights provided in "HBM3E: All About Bandwidth". Once trained, these models are deployed across a diverse range of... » read more

Dedicated Approximate Computing Framework To Efficiently Compute PCs On Hardware

By Technical Paper Link - 20 Jun, 2024 - Comments: 0

A technical paper titled “On Hardware-efficient Inference in Probabilistic Circuits” was published by researchers at Aalto University and UCLouvain. Abstract: "Probabilistic circuits (PCs) offer a promising avenue to perform embedded reasoning under uncertainty. They support efficient and exact computation of various probabilistic inference tasks by design. Hence, hardware-efficient compu... » read more

The Implications Of AI Everywhere: From Data Center To Edge

By Emma-Jane Crozier - 14 Mar, 2024 - Comments: 0

Generative AI has upped the ante on the transformative force of AI, driving profound implications across all aspects of our everyday lives. Over the past year, we have seen AI capabilities placed firmly in the hands of consumers. The recent news and product announcements emerging from MWC 2024 highlighted what we can expect to see from the next wave of generative AI applications. AI will be eve... » read more

Why A DSP Is Indispensable In The New World of AI

By Cadence - 25 Oct, 2023 - Comments: 0

Chips being designed today for the automotive, mobile handset, AI-IoT (artificial intelligence - Internet of things), and other AI applications will be fabricated in a year or two, designed into end products that will hit the market in three or more years, and then have a product lifecycle of at least five years. These chips will be used in systems with a large number and various types of senso... » read more

IBM’s Energy-Efficient NorthPole AI Unit

By Barry Pangrle - 28 Sep, 2023 - Comments: 0

At this point it is well known that from an energy efficiency standpoint, the biggest bang for the buck is to be found at the highest levels of abstraction. Fitting the right architecture to the task at hand, i.e., an application-specific architecture, will lead to benefits that are hard or impossible to claw back later in the design and implementation flow. With the huge increase in the inter... » read more

Your AI Chip Doesn’t Need An Expensive Insurance Policy

By Steve Roddy - 14 Sep, 2023 - Comments: 0

Imagine you are an architect designing a new SoC for an application that needs substantial machine learning inferencing horsepower. The team in marketing has given you a list of ML workloads and performance specs that you need to hit. The in-house designed NPU accelerator works well for these known workloads – things like MobileNet v2 and ResNet50. The accelerator speeds up 95+% of the comput... » read more

A Packet-Based Architecture For Edge AI Inference

By Pat Donnelly - 27 Jul, 2023 - Comments: 0

Despite significant improvements in throughput, edge AI accelerators (Neural Processing Units, or NPUs) are still often underutilized. Inefficient management of weights and activations leads to fewer available cores utilized for multiply-accumulate (MAC) operations. Edge AI applications frequently need to run on small, low-power devices, limiting the area and power allocated for memory and comp... » read more

A Bridge From Mars To Venus

By Steve Roddy - 20 Jul, 2023 - Comments: 0

In a now-famous 1992 pop psychology book titled "Men Are from Mars, Women Are from Venus," author John Gray posited that most relationship troubles in couples stem from fundamental differences in socialization patterns between men and women. The analogy that the two partners came from different planets was used to describe how two people could perceive issues in completely different and sometim... » read more


tag: inference

