Next Up: Touchless Screens

Gesture recognition will change the way we interact with our technology.


By Kurt Shuler
Qualcomm’s announcement this Monday that it has acquired assets from gesture recognition technology pioneer GestureTek makes it official: Gesture recognition based on video camera technology will be in phones sooner than we think.

Source: TI and YouTube

Video-based gesture recognition technology for consumer electronics is an offspring of the field of computer (or machine) vision. This branch of technology has been commercialized by companies like Mobileye, which create vision-based Driver Assistance Systems for cars from BMW, Volvo and GM. Be sure to view the “This is what our processors see” video on Mobileye’s website to understand how it works.

In mobile phones, TI has implemented gesture recognition as part of its “natural user interface” set of technologies in the OMAP 4 platform. (You can see a cool video here of a person writing on the desk next to a phone and having the text appear on the phone screen.) Qualcomm’s acquisition will create more competition and innovation in user interfaces, leading to phones that are more intuitive for us to use.

There are two big questions with gesture recognition:
1. What is gesture recognition good for?
2. What does gesture recognition cost?

Why Gesture Recognition?
If you saw the TI video referenced above, then you have already seen some of the benefits. The big benefit to me is to be able to make a call or read an email on my phone without smudging the screen with my greasy paws. Having a virtual writing or typing surface is also a benefit because either one of these serves to expand the useful human interaction zone of a phone beyond its physical footprint.

One of the more interesting applications for gesture recognition could be improving speech recognition. Every time Dragon comes out with an upgrade to its NaturallySpeaking application, I dutifully upgrade in the hope that its speech recognition accuracy will have improved enough to be useful to me. And every time I have been disappointed.

But what if my computer or phone could not only listen to my voice but also read my lips? Even though lip reading is accurate for only about 25% of the words in spoken English, could this bit of extra information help with speech recognition? I’m sure it couldn’t hurt.
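To make the intuition concrete, here is a minimal sketch of how a weak visual cue could tip the balance between two words that sound alike. It assumes the audio and visual estimates are independent and simply multiplies them; the function name `fuse`, the word pair, and all the probabilities are hypothetical, chosen only for illustration, and a real recognizer would be far more involved.

```python
def fuse(audio_probs, visual_probs):
    """Combine two per-word probability estimates by multiplying them
    and renormalizing (assumes the two sources are independent)."""
    joint = {w: audio_probs[w] * visual_probs.get(w, 0.0) for w in audio_probs}
    norm = sum(joint.values())
    if norm == 0:
        # The visual cue ruled out every candidate: fall back to audio alone.
        return dict(audio_probs)
    return {w: p / norm for w, p in joint.items()}

# Hypothetical numbers: in a noisy room "mine" and "nine" sound alike,
# but the lips close for the "m", so the camera leans toward "mine".
audio = {"mine": 0.5, "nine": 0.5}
visual = {"mine": 0.8, "nine": 0.2}

fused = fuse(audio, visual)
best = max(fused, key=fused.get)
print(best, fused)
```

Even a visual cue that is wrong much of the time can break a tie that the microphone alone cannot resolve, which is the sense in which the extra information "couldn't hurt."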

What Does Gesture Recognition Cost?
From a hardware standpoint, not much: Just a camera, a powerful processor or DSP, and some RAM. But the reality is more complicated.

On-Chip Quality of Service (QoS)
Adding gesture recognition to a phone increases the burden on the on-chip quality of service technology within the phone’s system on chip (SoC). Gesture recognition features do more than raise throughput requirements, which call for a “bigger pipe” in the SoC interconnect. They also add system initiators that need low-latency responses to give the illusion that the phone is responding instantaneously to user input.

This means that the SoC on-chip interconnect and its QoS system must recognize high-priority traffic immediately and shuttle it to its destination without disrupting other system services. Think of an ambulance speeding through rush hour traffic, and you get an idea of the technical difficulty of implementing QoS properly.
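The ambulance analogy can be sketched as a toy arbiter that always grants the most urgent pending request first. This is only an illustration of strict-priority arbitration, not a description of any real interconnect; the class names, initiator names, and priority values are all assumptions made for the example.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

@dataclass(order=True)
class Transaction:
    priority: int                          # lower number = more urgent
    seq: int                               # arrival order, breaks ties
    initiator: str = field(compare=False)  # not used for ordering

class PriorityArbiter:
    """Toy arbiter: grants the most urgent pending transaction each cycle."""

    def __init__(self):
        self._pending = []
        self._seq = count()

    def request(self, initiator, priority):
        heapq.heappush(
            self._pending, Transaction(priority, next(self._seq), initiator)
        )

    def grant(self):
        """Return the initiator granted this cycle, or None if idle."""
        if not self._pending:
            return None
        return heapq.heappop(self._pending).initiator

def drain(arbiter):
    """Grant until nothing is pending and return the grant order."""
    order = []
    while (winner := arbiter.grant()) is not None:
        order.append(winner)
    return order

arbiter = PriorityArbiter()
arbiter.request("video_dma", priority=2)
arbiter.request("display_dma", priority=2)
arbiter.request("gesture_cpu", priority=0)   # latency-critical, arrives last
grant_order = drain(arbiter)
print(grant_order)  # ['gesture_cpu', 'video_dma', 'display_dma']
```

The latency-critical request jumps the queue even though it arrived last. Strict priority on its own can starve low-priority traffic, which is one reason doing QoS properly is harder than this sketch suggests.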

Gesture recognition requires the data-intensive algorithms of edge detection, object model matching, viewed object morphing analysis, and real motion estimation. These algorithms generate bursts of high-bandwidth direct memory access (DMA) transfers as well as cached data accesses by processors. A processor’s performance degrades when its cache line refills are hammered by DMA traffic. Therefore, the ability to dynamically allocate bandwidth among the different transactional agents in an SoC gives the processors low-latency access, while still ensuring that all remaining bandwidth is efficiently allocated to the greedy DMAs.
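One way to picture this kind of allocation is a small function that serves the processor's demand first and then divides whatever is left among the DMA engines in proportion to their requests. This is a simplified sketch under assumed numbers; the function name, the agent names, and the MB/s figures are hypothetical, and real interconnects use more elaborate schemes.

```python
def allocate_bandwidth(total, cpu_demand, dma_demands):
    """Serve latency-sensitive CPU demand first, then share the remainder
    among DMA engines in proportion to what each one requested."""
    cpu_grant = min(cpu_demand, total)
    remaining = total - cpu_grant
    dma_total = sum(dma_demands.values())
    if dma_total <= remaining:
        # Enough bandwidth for everyone: grant each request in full.
        dma_grants = dict(dma_demands)
    else:
        # Oversubscribed: scale every DMA request by the same factor.
        dma_grants = {
            name: remaining * demand / dma_total
            for name, demand in dma_demands.items()
        }
    return cpu_grant, dma_grants

# Hypothetical figures in MB/s.
dma = {"camera_dma": 2000, "display_dma": 1500}

cpu_grant, dma_grants = allocate_bandwidth(3200, 400, dma)
print(cpu_grant, dma_grants)

# When the CPU's demand rises, rerunning the allocation shrinks the
# DMA shares while the CPU still gets everything it asked for.
busy_cpu_grant, busy_dma_grants = allocate_bandwidth(3200, 1200, dma)
print(busy_cpu_grant, busy_dma_grants)
```

Rerunning the allocation whenever demand changes is what makes it dynamic: the processor's cache refills are never squeezed, and the DMAs absorb whatever bandwidth is left over.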

Power Management
Properly implementing power management for a phone with gesture recognition features is very important because the end user must not notice any delay between gestures and system responses, yet the phone must maintain a low current draw on the battery. This means that phones implementing gesture recognition must have sophisticated hardware and software systems to quickly change the power state of the phone and its internal SoCs.

The challenge with power management is to turn off unneeded parts (or domains) of SoCs at the finest level of detail possible without increasing the complexity required to control these multiple power domains and states. An interconnect that is simple to partition makes it practical to create more power domains, which in turn allows greater power savings tailored to each use case.

Furthermore, fine granularity of clock gating is hugely beneficial. Modular interconnect design using small simple elements provides an opportunity for fine-grained clock gating.
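The benefit of finer granularity can be illustrated with a toy model that counts how many interconnect elements receive a clock over a short activity trace. It assumes that all elements sharing one clock gate are clocked whenever any of them is active, and it uses clocked element-cycles as a rough stand-in for clock power; the trace, the element count, and the function name are hypothetical, and the cost of the gating logic itself is ignored.

```python
def clocked_element_cycles(activity, group_size):
    """Count element-cycles that receive a clock.

    activity: list with one set per cycle, holding the indices of the
    elements that are doing useful work in that cycle.
    group_size: number of elements sharing one clock gate; the whole
    group is clocked whenever any member is active.
    """
    total = 0
    for active in activity:
        clocked_groups = {index // group_size for index in active}
        total += len(clocked_groups) * group_size
    return total

# Hypothetical trace for an interconnect built from 8 small elements.
trace = [{0}, {0, 1}, set(), {5}]

coarse = clocked_element_cycles(trace, group_size=8)  # one gate for all
medium = clocked_element_cycles(trace, group_size=4)  # two gates
fine = clocked_element_cycles(trace, group_size=1)    # one gate each
print(coarse, medium, fine)  # 24 12 4
```

In this trace only four element-cycles do useful work, so per-element gating clocks exactly those four, while a single gate for the whole block clocks 24.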

Gesture Recognition is For Real
With two of the largest mobile phone application processor vendors pursuing gesture recognition, it is only a matter of time before we see people walking down the sidewalk, heads down and seemingly swatting at their phones without actually touching them. Although an anthropologist might find this behavior strange, I see it as progress.

–Kurt Shuler is director of marketing at Arteris.