NPU Acceleration For Multimodal LLMs

By Pat Donnelly - 12 Dec, 2024 - Comments: 0

Transformer-based models have rapidly spread from text to speech, vision, and other modalities. This has created challenges for the development of Neural Processing Units (NPUs). NPUs must now efficiently support the computation of weights and propagation of activations through a series of attention blocks. Increasingly, NPUs must be able to process models with multiple input modalities with ac... » read more

Small Language Models: A Solution To Language Model Deployment At The Edge?

By Paul Karazuba - 18 Nov, 2024 - Comments: 0

While Large Language Models (LLMs) like GPT-3 and GPT-4 have quickly become synonymous with AI, LLM mass deployments in both training and inference applications have, to date, been predominantly cloud-based. This is primarily due to the sheer size of the models; the resulting processing and memory requirements often overwhelm the capabilities of edge-based systems. While the efficiency of Exped... » read more

Vision Is Why LLMs Matter On The Edge

By Ben Gomes - 30 May, 2024 - Comments: 0

Large Language Models (LLMs) have taken the world by storm since the 2017 Transformers paper, but pushing them to the edge has proved problematic. Just this year, Google had to revise its plans to roll out Gemini Nano on all new Pixel models — the down-spec’d hardware options proved unable to host the model as part of a positive user experience. But the implementation of language-focused mo... » read more

Considerations For Accelerating On-Device Stable Diffusion Models

By Pat Donnelly - 30 Nov, 2023 - Comments: 0

One of the more powerful – and visually stunning – advances in generative AI has been the development of Stable Diffusion models. These models are used for image generation, image denoising, inpainting (reconstructing missing regions in an image), outpainting (generating new pixels that seamlessly extend an image's existing bounds), and bit diffusion. Stable Diffusion uses a type of dif... » read more

Unlocking The Power Of Edge Computing With Large Language Models

By Paul Karazuba - 30 Oct, 2023 - Comments: 0

In recent years, Large Language Models (LLMs) have revolutionized the field of artificial intelligence, transforming how we interact with devices and the possibilities of what machines can achieve. These models have demonstrated remarkable natural language understanding and generation abilities, making them indispensable for various applications. However, LLMs are incredibly resource-intensi... » read more

Generative AI: Transforming Inference At The Edge

By Paul Karazuba - 24 Aug, 2023 - Comments: 0

The world is witnessing a revolutionary advancement in artificial intelligence with the emergence of generative AI. Generative AI generates text, images, or other media in response to prompts. We are in the early stages of this new technology; still, the depth and accuracy of its results are impressive, and its potential is mind-blowing. Generative AI uses transformers, a class of neural network... » read more

A Packet-Based Architecture For Edge AI Inference

By Pat Donnelly - 27 Jul, 2023 - Comments: 0

Despite significant improvements in throughput, edge AI accelerators (Neural Processing Units, or NPUs) are still often underutilized. Inefficient management of weights and activations leads to fewer available cores utilized for multiply-accumulate (MAC) operations. Edge AI applications frequently need to run on small, low-power devices, limiting the area and power allocated for memory and comp... » read more

A Buyers Guide To An NPU

By Paul Karazuba - 22 Jun, 2023 - Comments: 2

Choosing the right AI inference NPU (Neural Processing Unit) is a critical decision for a chip architect. There’s a lot at stake because as the AI landscape constantly changes, the choices will impact overall product cost, performance, and long-term viability. There are myriad options regarding system architecture and IP suppliers, and this can be daunting for even the most seasoned semicondu... » read more

An Ideal Always-Sensing Subsystem Architecture

By Paul Karazuba - 25 May, 2023 - Comments: 0

Always-sensing cameras are a relatively new method for users to interact with their smartphones, home appliances, and other consumer devices. Like always-listening audio-based Siri and Alexa, always-sensing cameras enable a seamless, more natural user experience. By continuously sampling and analyzing visual data, always-sensing enables use cases such as: “Find a face” detection for... » read more

Can Compute-In-Memory Bring New Benefits To Artificial Intelligence Inference?

By Paul Karazuba - 27 Apr, 2023 - Comments: 0

Compute-in-memory (CIM) is not necessarily an Artificial Intelligence (AI) solution; rather, it is a memory management solution. CIM could bring advantages to AI processing by speeding up the multiplication operation at the heart of AI model execution. However, for that to be successful, an AI processing system would need to be explicitly architected to use CIM. The change would entail a shift ... » read more

category: Inside Edge AI Processing

category: IoT, Security & Automotive
