The Coming NPU Population Collapse

By Steve Roddy - 12 Jun, 2025 - Comments: 0

At some point in our teenage school years, we were all taught in a nature or biology class about cycles of population surges followed by inevitable population collapses. Whether the example was an animal, plant, insect, or even bacteria, some external event triggers a rapid surge in the population of a species, which leads to overpopulation and competition for resources (food, space, s...

The Rise Of Generative AI On The Edge

By Gordon Cooper - 03 Apr, 2025 - Comments: 0

Artificial intelligence (AI) and machine learning (ML) have undergone significant transformations over the past decade. The revolution of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) is evolving toward the adoption of transformers and generative AI (GenAI), marking a pivotal shift in the field. This transition is driven by the need for more accurate, efficient, and ...

To (B)atch Or Not To (B)atch?

By Steve Roddy - 18 Nov, 2024 - Comments: 0

When evaluating benchmark results for AI/ML processing solutions, it is very helpful to remember Shakespeare’s Hamlet, and the famous line: “To be, or not to be.” Except in this case the “B” stands for Batched. Batch size matters. There are two different ways in which a machine learning inference workload can be used in a system. A particular ML graph can be used one time, preced...

Embrace The New!

By Steve Roddy - 14 Mar, 2024 - Comments: 0

The ResNet family of machine learning algorithms was introduced to the AI world in 2015. A slew of variations was rapidly discovered that at the time pushed the accuracy of ResNets close to the 80% threshold (78.57% Top 1 accuracy for ResNet-152 on ImageNet). This state-of-the-art performance at the time, coupled with the rather simple operator structure that was readily amenable to hardware ac...

BYO NPU Benchmarks

By Steve Roddy - 14 Dec, 2023 - Comments: 0

In our last blog post, we highlighted the ways that NPU vendors can shade the truth about performance on benchmark networks such that comparing common performance scores such as “Resnet50 Inferences / Second” can be a futile exercise. But there is a straightforward, low-investment method for an IP evaluator to short-circuit all the vendor shenanigans and get a solid apples-to-apples result...

Considerations For Accelerating On-Device Stable Diffusion Models

By Pat Donnelly - 30 Nov, 2023 - Comments: 0

One of the more powerful – and visually stunning – advances in generative AI has been the development of Stable Diffusion models. These models are used for image generation, image denoising, inpainting (reconstructing missing regions in an image), outpainting (generating new pixels that seamlessly extend an image's existing bounds), and bit diffusion. Stable Diffusion uses a type of dif...

Does Your NPU Vendor Cheat On Benchmarks?

By Steve Roddy - 12 Oct, 2023 - Comments: 0

It is common industry practice for companies seeking to purchase semiconductor IP to begin the search by sending prospective vendors a list of questions, typically called an RFI (Request for Information) or simply a Vendor Spreadsheet. These spreadsheets contain a wide gamut of requested information ranging from background on the vendor’s financial status, leadership team, IP design practices...

Your AI Chip Doesn’t Need An Expensive Insurance Policy

By Steve Roddy - 14 Sep, 2023 - Comments: 0

Imagine you are an architect designing a new SoC for an application that needs substantial machine learning inferencing horsepower. The team in marketing has given you a list of ML workloads and performance specs that you need to hit. The in-house designed NPU accelerator works well for these known workloads – things like MobileNet v2 and ResNet50. The accelerator speeds up 95+% of the comput...

Compiler-Driven Performance Boosts For GPNPUs

By Steve Roddy - 10 Aug, 2023 - Comments: 0

The GNU C Compiler – GCC – was first released in 1987, 36 years ago. Several version streams are still actively being developed and enhanced, with GCC 13 being the most advanced and GCC v10.5 released in early July this year. You might think that, after 36 years of refinement by thousands of contributors, ultimate performance has been achieved. All that could be discovered has bee...

A Packet-Based Architecture For Edge AI Inference

By Pat Donnelly - 27 Jul, 2023 - Comments: 0

Despite significant improvements in throughput, edge AI accelerators (Neural Processing Units, or NPUs) are still often underutilized. Inefficient management of weights and activations leads to fewer available cores utilized for multiply-accumulate (MAC) operations. Edge AI applications frequently need to run on small, low-power devices, limiting the area and power allocated for memory and comp...


tag: neural processing unit


