AI Bubble Or Boom?

By Geoff Tate - 20 Oct, 2025 - Comments: 1

Are we in an AI bubble? Parallels are being drawn to the dot.com boom/bust of 1999-2000. In the dot.com bust, many high-tech companies' valuations soared 10X, then deflated. The peak P/E ratio for the Nasdaq Composite was 200! Remember Webvan? It went public in November 1999 with an $8 billion valuation, then filed for bankruptcy 19 months later. It was much speculation without profits or gro... » read more

GDDR7 Tackles Massive-Context AI Inference

By Nidish Kamath - 16 Oct, 2025 - Comments: 0

The AI hardware landscape is evolving at breakneck speed, and memory technology is at the heart of this transformation. NVIDIA’s recent announcement of Rubin CPX, a new class of GPU purpose-built for massive-context inference, underscores this trend. Rubin CPX is designed to tackle workloads that require reasoning across millions of tokens. Use cases include long-form generative video, comple... » read more

Re-Architecting AI For Power

By Brian Bailey - 14 Aug, 2025 - Comments: 0

The industry is becoming increasingly concerned about the amount of power being consumed by AI, but there is no simple solution to the problem. It requires a deep understanding of the application, the software and hardware architectures at both the semiconductor and system levels, and how all of this is designed and implemented. Each piece plays a role in the total power consumed and the utilit... » read more

Transformers At The Edge: Efficient LLM Deployment

By Paul Karazuba - 17 Jul, 2025 - Comments: 0

Since the groundbreaking 2017 publication of “Attention Is All You Need,” the transformer architecture has fundamentally reshaped artificial intelligence research and development. This innovation laid the foundation for Large Language Models (LLMs) and Video Language Models (VLMs), fueling a wave of productization across the industry. A defining milestone was the public launch of ChatGPT in... » read more

Implementing AI Activation Functions

By Bryon Moyer - 10 Apr, 2025 - Comments: 1

Activation functions play a critical role in AI inference, helping to ferret out nonlinear behaviors in AI models. This makes them an integral part of any neural network, but nonlinear functions can be fussy to build in silicon. Is it better to have a CPU calculate them? Should hardware function units be laid down to execute them? Or would a lookup table (LUT) suffice? Most architectures inc... » read more

Normalization Keeps AI Numbers In Check

By Bryon Moyer - 13 Feb, 2025 - Comments: 0

AI training and inference are all about running data through models, typically to make some kind of decision. But the paths that the calculations take aren’t always straightforward, and as a model processes its inputs, those calculations may go astray. Normalization is a process that can keep data in bounds, improving both training and inference. Forgoing normalization can result in at... » read more

To (B)atch Or Not To (B)atch?

By Steve Roddy - 18 Nov, 2024 - Comments: 0

When evaluating benchmark results for AI/ML processing solutions, it is very helpful to remember Shakespeare’s Hamlet, and the famous line: “To be, or not to be.” Except in this case the “B” stands for Batched. Batch size matters: There are two different ways in which a machine learning inference workload can be used in a system. A particular ML graph can be used one time, preced... » read more

GDDR7 Memory Supercharges AI Inference

By Tim Messegee - 17 Oct, 2024 - Comments: 0

GDDR7 is the state-of-the-art graphics memory solution with a performance roadmap of up to 48 Gigatransfers per second (GT/s) and memory throughput of 192 GB/s per GDDR7 memory device. The next generation of GPUs and accelerators for AI inference will use GDDR7 memory to provide the memory bandwidth needed for these demanding workloads. AI is two applications: training and inference. With tr... » read more
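
As a rough check on the figures in this excerpt, the per-device throughput follows from the data rate if one assumes a 32-bit-wide device interface. The excerpt does not state the width, so it is an assumption here, and the helper name below is hypothetical. A minimal Python sketch:

```python
def device_bandwidth_gb_per_s(data_rate_gt_per_s: float,
                              bus_width_bits: int = 32) -> float:
    """Peak per-device bandwidth in GB/s.

    Each transfer moves one bit per data pin, so:
    bandwidth = transfers per second * bits per transfer / 8 bits per byte.
    The 32-bit default width is an assumption, not taken from the excerpt.
    """
    return data_rate_gt_per_s * bus_width_bits / 8


# 48 GT/s across an assumed 32-bit interface
print(device_bandwidth_gb_per_s(48))  # 192.0
```

Under that assumption, 48 GT/s works out to the 192 GB/s per device quoted above; a different interface width would scale the result proportionally.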

GDDR7: The Ideal Memory Solution In AI Inference

By Frank Ferro - 29 Aug, 2024 - Comments: 1

The generative AI market is experiencing rapid growth, driven by the increasing parameter size of Large Language Models (LLMs). This growth is pushing the boundaries of performance requirements for training hardware within data centers. For an in-depth look at this, consider the insights provided in "HBM3E: All About Bandwidth". Once trained, these models are deployed across a diverse range of... » read more

Dedicated Approximate Computing Framework To Efficiently Compute PCs On Hardware

By Technical Paper Link - 20 Jun, 2024 - Comments: 0

A technical paper titled “On Hardware-efficient Inference in Probabilistic Circuits” was published by researchers at Aalto University and UCLouvain. Abstract: "Probabilistic circuits (PCs) offer a promising avenue to perform embedded reasoning under uncertainty. They support efficient and exact computation of various probabilistic inference tasks by design. Hence, hardware-efficient compu... » read more


tag: inference

