SPONSOR BLOG

Speech Applications Will Enable A New Category Of Edge AI Chips

Full speech recognition will require fundamental innovations that allow processing at very high performance per watt.

August 4th, 2022 - By: Anand Joshi

Speech recognition has become an increasingly important feature in a wide range of devices. Wakewords such as Alexa, OK Google, or Siri are now a standard feature of wearables, smart speakers, mobile phones, and even laptops. These devices have already shipped in the millions of units, and consumers are getting better at using the feature. Wakeword recognition is slowly evolving into wakeword personalization, which enables users to set their own word to wake the device. A further extension of this feature is command recognition, which enables devices to recognize dozens of spoken commands.

Speech can come from a wide range of environments, some of them noisy or windy. Emerging speech-related use cases deal with background noise elimination, speech enhancement, or active noise cancellation. An example of such a use case would be a device completely eliminating the noise of a vacuum cleaner running in the background. Some device manufacturers are even thinking about enabling complete speech recognition capability on a device. A device with such a feature would be able to listen to questions from the user, understand context, and provide an answer. For example, one could ask a microwave about the best settings for microwaving popcorn, and the device could respond by describing those settings.

Speech processing is computationally expensive. The compute required for speech applications on the edge can range from MegaOPS to GigaOPS (or even higher if there’s no compression). Any chip supporting speech applications must provide the necessary compute within the performance, power, and cost envelope dictated by these devices. This gives AI chip companies a new market in which to grow.
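To make the MegaOPS-to-GigaOPS range concrete, here is a minimal back-of-the-envelope sketch. It counts each multiply-accumulate (MAC) as two operations, which is a common convention. The model sizes and inference rates are hypothetical placeholders chosen for illustration, not measurements of any real product, and the function names are made up for this example.

```python
def ops_per_second(macs_per_inference: int, inferences_per_second: float) -> float:
    """Sustained operations per second, counting one MAC as two operations."""
    return 2 * macs_per_inference * inferences_per_second


def format_ops(ops: float) -> str:
    """Render an operation rate in the units used in the text."""
    for unit, scale in (("GigaOPS", 1e9), ("MegaOPS", 1e6)):
        if ops >= scale:
            return f"{ops / scale:.1f} {unit}"
    return f"{ops:.0f} OPS"


# Hypothetical workloads (assumed numbers, for illustration only).
wakeword_ops = ops_per_second(macs_per_inference=500_000, inferences_per_second=50)
full_asr_ops = ops_per_second(macs_per_inference=100_000_000, inferences_per_second=100)

print("wakeword detection:", format_ops(wakeword_ops))
print("full speech recognition:", format_ops(full_asr_ops))
```

Under these assumed numbers, the gap between a small wakeword model and a full recognizer is several hundred times, which is the scaling problem the rest of this article is concerned with.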

There are several challenges these chips must overcome to make this a sustainable, long-term business. First, there’s the performance per watt limitation, which is particularly critical for battery-powered devices. The chip must provide the highest possible compute within the available energy to enable efficient processing of speech. It must also meet performance requirements such as latency while staying within that performance per watt envelope.
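The performance per watt constraint can be illustrated with simple arithmetic: power draw is the sustained operation rate divided by the chip's efficiency. The sketch below assumes a hypothetical 20 GigaOPS workload, hypothetical efficiencies of 1 and 10 TOPS/W, and a hypothetical 370 mWh battery. None of these figures come from a specific chip or device, and the calculation ignores every other power consumer in the system.

```python
def power_mw(ops_per_second: float, tops_per_watt: float) -> float:
    """Average compute power in milliwatts for a sustained operation rate."""
    watts = ops_per_second / (tops_per_watt * 1e12)
    return watts * 1e3


def battery_hours(battery_mwh: float, load_mw: float) -> float:
    """Hours of continuous operation if compute were the only load."""
    return battery_mwh / load_mw


WORKLOAD_OPS = 20e9   # assumed: 20 GigaOPS of sustained speech processing
BATTERY_MWH = 370.0   # assumed: a small wearable-class battery

for efficiency in (1.0, 10.0):  # assumed efficiencies in TOPS/W
    load = power_mw(WORKLOAD_OPS, efficiency)
    hours = battery_hours(BATTERY_MWH, load)
    print(f"{efficiency:>4.1f} TOPS/W: {load:.1f} mW, about {hours:.0f} hours")
```

In this simplified model, a tenfold gain in performance per watt translates directly into a tenfold gain in always-on listening time, which is why efficiency rather than peak compute dominates the design of battery-powered speech chips.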

Today’s popular chips for wakeword detection are based on CPU/DSP architectures. Companies such as Syntiant, Synaptics, and Ambiq, which have shipped millions of units, use CPU/DSP architectures. Some other vendors, such as Analog Devices, use systolic arrays for AI algorithm acceleration. However, it will be extremely hard, if not impossible, for these architectures to deliver the compute required for full speech recognition within the energy envelope, given the limitations imposed by semiconductor node physics. To get to the level of compute needed for full speech recognition, some fundamental innovations may be necessary. New architectures that allow processing at very high performance per watt, such as Processing in Memory (PIM) or the Legendre Memory Unit, might be necessary.

Then there’s the Bill of Materials (BOM) restriction. Speech is one of many features available on these devices, and OEMs must balance their available budget against silicon spend. It is unclear whether OEMs would be willing to pay separately for a new class of chips that enable speech functionality and, if so, how much. This might put a limit on the Average Selling Price (ASP) of such chips. OEMs might demand maximum functionality at the lowest possible price. For example, they might ask for complete speech recognition functionality at, say, $1, the same price that buys only wakeword detection today.

Additionally, there’s the challenge of evolving algorithms. Speech applications could use classic, shallow machine learning algorithms or modern, deeper neural network-based algorithms. Speech application pipelines could also demand support for speech decode/encode and DSP. All these concepts are relatively new and evolving. Mapping these software algorithms to a given chip architecture poses a great challenge. Compressing and optimizing for the best performance poses an even larger challenge.

The trend is nevertheless positive as of 2022: OEMs are announcing products and vendors are announcing funding rounds. To make the most of the opportunity, the chip world must overcome several challenges and keep innovating. In time, we will know whether speech applications will lead to a new category of AI chips.

Anand Joshi

Anand Joshi is a semiconductor industry executive with 25+ years of industry experience. He is a recognized expert in the artificial intelligence community and speaks frequently at conferences on AI technology, markets, and products. His market research reports from Tractica/Omdia on computer vision, AI chipsets, and data center infrastructure have been used by top semiconductor companies and OEMs for strategic planning purposes since 2015. He’s been quoted by Bloomberg, among others. His career spans Synopsys, LSI Logic, and Poseidon Design Systems. He holds an MSEE from Virginia Tech and an MBA from UC Irvine.
