Knowledge Center

Voice control, speech recognition, voice-user interface (VUI)

Using voice/speech for device command and control.

Description

Speaking to a device is catching on as a way to interact with electronics, a trend made popular by voice-enabled internet search. Common uses are internet searches through voice assistants (such as Alexa and Siri), real-time translation, device command and control (such as hands-free phone use in an automobile), and data collection. The main advantage is that you do not need to type a message, which usually takes longer than speaking. Speaking also frees your hands for other things, such as holding a steering wheel.

Voice control is also known as a voice-user interface (VUI). A VUI often works by sending the captured speech to a data center for processing, which can introduce some latency into an exchange.
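
To make the latency point concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes the whole utterance is uploaded after capture (no streaming) and that the delay is simply upload time, plus network round trip, plus server processing, plus reply download. The function names and all of the numbers are hypothetical and chosen only for illustration.

```python
def pcm_bytes(seconds, sample_rate_hz=16000, bytes_per_sample=2):
    """Size of uncompressed mono PCM audio (assumed format) in bytes."""
    return int(seconds * sample_rate_hz * bytes_per_sample)


def transfer_ms(num_bytes, kbps):
    """Milliseconds to move num_bytes over a link of kbps kilobits per second."""
    # bits / (kilobits per second) gives milliseconds
    return num_bytes * 8 / kbps


def cloud_latency_ms(audio_bytes, reply_bytes, uplink_kbps, downlink_kbps,
                     rtt_ms, server_ms):
    """Rough end-to-end delay for one off-device speech request."""
    return (transfer_ms(audio_bytes, uplink_kbps)      # send the speech
            + rtt_ms                                   # network round trip
            + server_ms                                # data center processing
            + transfer_ms(reply_bytes, downlink_kbps)) # receive the answer


if __name__ == "__main__":
    audio = pcm_bytes(2.0)  # a two-second command
    total = cloud_latency_ms(audio, reply_bytes=2000,
                             uplink_kbps=1000, downlink_kbps=5000,
                             rtt_ms=50, server_ms=100)
    print(f"{audio} bytes of audio, about {total:.0f} ms before a reply")
```

With these illustrative figures the upload dominates: 64,000 bytes over a 1,000 kbps uplink takes 512 ms, and the total comes to about 665 ms. Compressing or streaming the audio, or processing it on the device, would shrink that term.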

The components that make speech work in a voice command device (VCD):

On the device: Hardware/software

  • Audio DSPs
  • Specialized silicon
  • Speakers
  • Transmitter/receiver (TX/RX)
  • Neural networks and neural network algorithms for speech processing. Recurrent neural networks (RNNs), including the long short-term memory (LSTM) variant, are commonly used for audio.
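
As a rough illustration of the LSTM mentioned in the list above, the following sketch runs one standard LSTM cell over a short sequence of audio feature frames using only the Python standard library. The weights are random and untrained, the gate names and helper functions (`lstm_step`, `init_params`, `run_sequence`) are hypothetical, and a real voice device would run a trained, optimized model on a DSP or specialized silicon rather than code like this.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    x: input feature vector (for example, features of one audio frame)
    h_prev, c_prev: previous hidden state and cell state
    W: gate name -> weight matrix shaped [hidden][input + hidden]
    b: gate name -> bias vector of length hidden
    """
    z = list(x) + list(h_prev)  # concatenate input and previous hidden state

    def affine(gate):
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(W[gate], b[gate])]

    i = [sigmoid(a) for a in affine("input")]         # how much new info to write
    f = [sigmoid(a) for a in affine("forget")]        # how much old state to keep
    o = [sigmoid(a) for a in affine("output")]        # how much state to expose
    g = [math.tanh(a) for a in affine("candidate")]   # proposed new content

    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases (untrained, illustration only)."""
    rng = random.Random(seed)
    gates = ("input", "forget", "output", "candidate")
    W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(input_size + hidden_size)]
             for _ in range(hidden_size)] for g in gates}
    b = {g: [0.0] * hidden_size for g in gates}
    return W, b


def run_sequence(frames, hidden_size, seed=0):
    """Feed frames through the cell in order and return the final hidden state."""
    W, b = init_params(len(frames[0]), hidden_size, seed)
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in frames:
        h, c = lstm_step(x, h, c, W, b)
    return h


if __name__ == "__main__":
    frames = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4]]  # made-up feature frames
    print(run_sequence(frames, hidden_size=4))
```

The cell state `c` is what lets the network carry information across many frames, which is why this family of networks suits sequential data such as speech.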

Off-device computing
Data center/cloud or edge computers process the voice signals quickly and return an answer.

An example of a speaker identification system, taken from an ARM white paper, is shown below.

