Implementing AI Activation Functions


Activation functions play a critical role in AI inference, providing the nonlinear behaviors that let AI models capture complex patterns. This makes them an integral part of any neural network, but nonlinear functions can be fussy to build in silicon. Is it better to have a CPU calculate them? Should hardware function units be laid down to execute them? Or would a lookup table (LUT) suffice? Most architectures inc... » read more
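
To make the LUT option concrete, here is a minimal Python sketch of the idea (the table size and input range are illustrative assumptions, not parameters from the article): a precomputed table trades a small, bounded approximation error for a much cheaper per-element operation.

```python
import numpy as np

# Build a 256-entry lookup table for tanh over [-4, 4]. Table size and
# range are illustrative choices, not figures from the article.
TABLE_SIZE = 256
X_MIN, X_MAX = -4.0, 4.0
_lut = np.tanh(np.linspace(X_MIN, X_MAX, TABLE_SIZE))

def tanh_lut(x):
    # Map each input to its nearest table entry; inputs outside the
    # table range saturate at the endpoints, much as tanh itself does.
    pos = (x - X_MIN) / (X_MAX - X_MIN) * (TABLE_SIZE - 1)
    idx = np.clip(np.round(pos), 0, TABLE_SIZE - 1).astype(int)
    return _lut[idx]

x = np.linspace(-6.0, 6.0, 1001)
print("worst-case LUT error:", np.abs(tanh_lut(x) - np.tanh(x)).max())
```

In silicon, the same structure maps to a small ROM plus index logic; interpolating between adjacent entries buys extra accuracy at the cost of a multiplier.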

HW-Aligned Sparse Attention Architecture For Efficient Long-Context Modeling (DeepSeek et al.)


A new technical paper titled "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention" was published by DeepSeek, Peking University and University of Washington. Abstract "Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant computational challenges. Sparse attention... » read more

Normalization Keeps AI Numbers In Check


AI training and inference are all about running data through models — typically to make some kind of decision. But the paths that the calculations take aren’t always straightforward, and as a model processes its inputs, those calculations may go astray. Normalization is a process that can keep data in bounds, improving both training and inference. Forgoing normalization can result in at... » read more
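
As a concrete illustration, here is a minimal layer-normalization sketch in Python (one of several normalization schemes; the epsilon and the learned gamma/beta parameters follow common practice rather than anything specified in the article):

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # Normalize each row's features to zero mean and unit variance,
    # then restore expressiveness with a learned scale and shift.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

x = np.array([[0.2, 180.0, -45.0, 3.1]])   # activations drifting out of range
print(layer_norm(x, gamma=np.ones(4), beta=np.zeros(4)))
```

However far the raw activations drift, the normalized output stays in a well-behaved range, which is exactly the in-bounds property the article describes.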

New AI Data Types Emerge


AI is all about data, and how that data is represented matters greatly. But after focusing primarily on 8-bit integers and 32-bit floating-point numbers, the industry is now looking at new formats. There is no single best type for every situation, because the choice depends on the type of AI model; whether accuracy, performance, or power is prioritized; and where the computing happens, ... » read more
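
The tradeoff between 32-bit floats and 8-bit integers is easy to see in a few lines of Python. This is a symmetric per-tensor scheme chosen for simplicity; production quantizers offer finer-grained options.

```python
import numpy as np

def quantize_int8(x):
    # Symmetric, per-tensor quantization: one FP32 scale maps the
    # whole tensor onto the int8 grid.
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    return q.astype(np.float32) * scale

w = np.array([0.82, -1.73, 0.05, 0.41, -0.99], dtype=np.float32)
q, s = quantize_int8(w)
print(q)                       # int8 codes, 4x smaller than FP32
print(dequantize_int8(q, s))   # close to w, but with quantization error
```

The round trip shows the small error that integer formats accept in exchange for smaller storage and cheaper arithmetic, which is the balance every new data type renegotiates.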

Mass Customization For AI Inference


Rising complexity in AI models and an explosion in the number and variety of networks are leaving chipmakers torn between fixed-function acceleration and more programmable accelerators, and creating some novel approaches that include some of both. By all accounts, a general-purpose approach to AI processing is not making the grade. General-purpose processors are exactly that. They're not des... » read more

Novel NorthPole Architecture Enables Low-Latency, High-Energy-Efficiency LLM Inference (IBM Research)


A new technical paper titled "Breakthrough low-latency, high-energy-efficiency LLM inference performance using NorthPole" was published by researchers at IBM Research. At the IEEE High Performance Extreme Computing (HPEC) Virtual Conference in September 2024, new performance results for their AIU NorthPole AI inference accelerator chip were presented on a 3-billion-parameter Granite LLM. ... » read more

Supercharging AI Inference With GDDR7


A rapid rise in the size and sophistication of AI inference models requires increasingly powerful AI accelerators and GPUs deployed in edge servers and client PCs. GDDR7 memory offers an attractive combination of bandwidth, capacity, latency, and power for these accelerators and processors. The Rambus GDDR7 Memory Controller IP offers industry-leading GDDR7 performance of up to 40 Gbps and 160 G... » read more
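
As a sanity check on how a per-pin data rate translates into device bandwidth, here is a quick back-of-envelope calculation assuming a 32-bit-wide GDDR7 device interface (an illustrative configuration, not a figure from the article):

```python
# Back-of-envelope GDDR7 bandwidth math. The 32-bit interface width is an
# assumed typical device configuration, not taken from the article.
data_rate_gbps_per_pin = 40
interface_width_bits = 32
bandwidth_gbps = data_rate_gbps_per_pin * interface_width_bits  # 1280 Gb/s
print(bandwidth_gbps / 8, "GB/s")                               # 160.0 GB/s
```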

Why A DSP Is Indispensable In The New World Of AI


Chips being designed today for automotive, mobile handset, AI-IoT (artificial intelligence/Internet of Things), and other AI applications will be fabricated in a year or two, designed into end products that will hit the market in three or more years, and then have a product lifecycle of at least five years. These chips will be used in systems with a large number and variety of senso... » read more