Deep-pocket companies begin customizing this approach for specific applications—and spend huge amounts of money to acquire startups.
Neural networking with advanced parallel processing is beginning to take root in a number of markets ranging from predicting earthquakes and hurricanes to parsing MRI image datasets in order to identify and classify tumors.
As this approach gets implemented in more places, it is being customized and parsed in ways that many experts never envisioned. And it is driving new research into how else these kinds of compute architectures can be applied.
Fjodor van Veen, deep learning researcher at The Asimov Institute in the Netherlands, has identified 27 distinct neural net architecture types. (See Fig. 1 below). The differences are largely application-specific.
Neural networking is based on the concept of threshold logic algorithms, which were first proposed in 1943 by Warren McCulloch, a neurophysiologist, and Walter Pitts, a logician. Research plodded along for the next 70 years, but then it really began to spike.
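As a rough illustration of what threshold logic means, a single unit sums its weighted inputs and fires only when that sum reaches a threshold. The following is a minimal sketch in Python, not a reconstruction of the 1943 formulation; the function names, weights and thresholds are all assumptions chosen for illustration.

```python
def threshold_unit(inputs, weights, threshold):
    """Return 1 (fire) when the weighted sum of inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# Hypothetical weight/threshold choices that make one unit behave like a logic gate.
def logical_and(a, b):
    # Both inputs must be active for the sum (2) to reach the threshold.
    return threshold_unit([a, b], [1, 1], threshold=2)


def logical_or(a, b):
    # A single active input is enough to reach the threshold.
    return threshold_unit([a, b], [1, 1], threshold=1)


def logical_not(a):
    # A negative (inhibitory) weight keeps the unit silent when the input is active.
    return threshold_unit([a], [-1], threshold=0)
```

Modern networks replace the hard threshold with smooth functions and learn the weights from data, but the weighted-sum-then-decide structure is the same.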
“The big bang happened in 2012-2013 when two landmark papers were published, both using GPUs,” said Roy Kim, Nvidia’s Accelerated Computing Group product team lead. One of those papers, “ImageNet Classification with Deep Convolutional Neural Networks,” was written by Geoffrey Hinton and his team at the University of Toronto (Hinton now also works half-time at Google). Then in 2013, Andrew Ng of Stanford (now also chief scientist at Baidu) and his team published “Deep Learning with COTS HPC Systems.”
Nvidia recognized early on that deep neural networks were the foundation of the revolution in Artificial Intelligence (AI) and began investing in ways to bring GPUs into this world. Kim pointed to convolutional neural networks, recurrent neural networks, and Long Short-Term Memory (LSTM) networks, among others, each of which is designed to solve a specific problem, such as image recognition, speech recognition or language translation. He noted that Nvidia is hiring hardware and software engineers in all of these areas.
In June, with a big news splash, Google upped the ante in the arms race over semiconductors for neural networks. Norm Jouppi, a Google distinguished hardware engineer, unveiled details of the company’s several-year effort, the Tensor Processing Unit (TPU), an ASIC that implements components of a neural network in silicon. The alternative, which Google also uses, is to rely on raw silicon compute power and memory banks with software on top.
The TPU is optimized for TensorFlow, Google’s software library for numerical computation using data flow graphs. It has been running in Google data centers for more than a year.
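For readers unfamiliar with the term, a data flow graph describes a computation as nodes (operations) connected by edges (the values flowing between them), so the graph can be built first and evaluated later. The sketch below is a toy pure-Python version and does not use or reproduce TensorFlow's API; the `Node`, `constant` and `evaluate` names are hypothetical.

```python
import operator


class Node:
    """One vertex in a toy data flow graph: either a constant or an operation."""

    def __init__(self, op=None, inputs=(), value=None):
        self.op = op                  # callable applied to the input values, or None
        self.inputs = tuple(inputs)   # upstream nodes this node depends on
        self.value = value            # only used by constant nodes


def constant(value):
    """Create a leaf node that simply holds a value."""
    return Node(value=value)


def evaluate(node, cache=None):
    """Evaluate a node by first evaluating everything it depends on."""
    if cache is None:
        cache = {}
    key = id(node)
    if key in cache:                  # shared sub-graphs are computed only once
        return cache[key]
    if node.op is None:
        result = node.value
    else:
        args = [evaluate(parent, cache) for parent in node.inputs]
        result = node.op(*args)
    cache[key] = result
    return result


# Build the graph for y = w * x + b, then run it.
x, w, b = constant(3.0), constant(2.0), constant(1.0)
y = Node(operator.add, [Node(operator.mul, [w, x]), b])
result = evaluate(y)
```

Separating graph construction from execution is what lets a runtime decide where each operation runs, whether on a CPU, a GPU or a dedicated ASIC.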
It’s not just established players that are vying for a piece of this market, though. Start-ups Knupath and Nervana entered the fray in a quest to engineer fields of transistors with neural networks in mind. Intel paid a reported $408 million for Nervana last month.
The silicon “engine” that Nervana has been developing is due sometime in 2017. Nervana hints at forgoing memory caches because it has 8 Tb/second memory access speeds. But the fact that Nervana started with $600,000 in seed capital in 2014 and was sold for roughly 680 times that amount two years later is a testament to how seriously the industry and financiers are taking the space, and to how hot this market has become.
Automotive is a core application for this technology, particularly for ADAS. “The key is that you have to decide on what the image is and get that into a convolutional neural network algorithm,” said Charlie Janac, chairman and CEO of Arteris. “There are two approaches. One is a GPU. The other is an ASIC, which ultimately wins and uses less power. But a really good implementation is tightly coupled hardware and software systems.”
Getting to that point of tightly coupled systems, though, depends on a deep understanding of exactly what problem needs to be solved.
“We have to develop different options regarding the technology,” said Marie Semeria, CEO of Leti. “This is completely new. First, we have to consider the usage. It’s a completely different way of driving technology. It’s a neuromorphic technology push. What exactly do you need? And then, you develop a solution based on that.”
That solution can be incredibly fast, too. Chris Rowen, a Cadence advisor, said that some of these systems can run trillions of operations per second. But not all of these operations are completely accurate, and that has to be built into the system, as well.
“You have to adopt a statistical measure to govern correctness,” Rowen said. “You are using much more highly parallel architectures.”
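One common way to express such a statistical measure is to score a system by the fraction of labeled samples it gets right, rather than by whether every individual operation is exact. The following is a minimal sketch of a top-k accuracy metric; it is an illustrative assumption about what such a measure could look like, not a description of any vendor's method, and the function name and sample data are hypothetical.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes.

    scores: list of per-class score lists, one list per sample
    labels: list of true class indices, one per sample
    """
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and the same length")
    hits = 0
    for row, label in zip(scores, labels):
        # Rank class indices from highest score to lowest.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical scores for three samples over three classes.
example_scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
example_labels = [1, 1, 2]

top1 = top_k_accuracy(example_scores, example_labels, k=1)
top2 = top_k_accuracy(example_scores, example_labels, k=2)
```

A design team can then ask whether a faster, less exact implementation keeps this rate within an agreed tolerance of the reference result.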
What works best?
Comparing these systems isn’t easy, though. In discussing the Google TPU, Jouppi highlighted one way that teams of researchers and engineers around the world can benchmark their work and the performance of the hardware and software they are utilizing: ImageNet. ImageNet is a collection of 14 million images that university researchers maintain independently, and it allows the engineering teams to time how fast their systems find objects and classify (usually a subset of) them.
Later this month the results of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2016 will be released, as part of the co-located European Conference on Computer Vision (ECCV) taking place in Amsterdam.
All the players discussed here will be there, including Nvidia, Baidu, Google and Intel. Qualcomm will be there as well, as it starts to field the critical software libraries, just as Nvidia has been. Nvidia will be demonstrating its DGX-1 Deep Learning Appliance and the Jetson embedded platform that is tuned for video analysis and machine learning.
Arnold Smeulders and Theo Gevers, the general chairs of ECCV 2016, told Semiconductor Engineering that many of the attendees of ECCV do work in the area of semiconductor technologies (as opposed to software that runs on silicon) that enable computer vision.
“Recently, the semiconductor technology powerhouses have developed an interest in computer vision,” they said via e-mail. “Now, a computer can understand what types of things and types of scenes are present in an image. This requires a description of the image in features matched with machine learning from annotated examples. Over the last five years, the power of these algorithms has increased tremendously using deep learning architectures. As they are compute-intensive, chip manufacturers including Qualcomm, Intel, Huawei and Samsung, as well as AI firms such as Apple, Google, Amazon, and lots of high-tech start-ups enter the stage of computer vision as well.”
Smeulders and Gevers said interest has grown to such heights that the conference venue reached its maximum capacity a month before registration was expected to end.
Previous editions of ECCV grew by about 100 attendees each year over a long period, ending with 1,200 for the previous edition in Zurich. “This year, we have reached 1,500 attendees one month before the conference. At that point we had to close registration as the building we have hired, the Royal Theater of the Netherlands, cannot hold more comfortably,” they wrote.
With an industry this complex and changing, it is challenging for engineering managers to find the necessary skill sets. It also raises questions for electrical engineering and computer science students about whether their courses of study will be obsolete before they can apply them in the market. So what could a student interested in electronics, and particularly semiconductors’ role in computer vision, do to get a job in the field?
“As image [recognition] is a lot of data coming in every 1/30th of a second, it is only natural that there is attention from the semiconductor industry,” Smeulders and Gevers said. “Due to the massive amounts of data, until recently the attention was restricted to processing the image to produce another image (more visible, sharper, or highlighting elements in the image) or to reduce the image (a target region, or a compressed version). This is the field of image processing: image in, image out. But recently, the field of computer vision, that is, the interpretation of what is visible in the image, has gone through a very productive period. To understand the nature of images, the way they are being formed and the way they are being analyzed with deep networks, are the key components of modern computer vision,” they wrote.
Stochastic signal processing, machine learning, and computer vision would be relevant areas of study and training.
The Asian counterpart of ECCV is Asian CCV. In the odd years, an International CCV is held. Smeulders and Gevers noted that the Computer Vision and Pattern Recognition conference, held every summer in the United States, complements the CCVs but has a slightly different focus. Google considers it among the top 100 sources of research over all disciplines under review, the only conference in the list. Papers are due on Nov. 15 for the July conference in Honolulu.
Nvidia’s Kim and others consider 2012 to 2013 to be a “big bang” for GPUs being utilized in these neural network applications for tasks like computer vision. So what’s the next big bang?
“Since 2004, when the race for categorizing the image content began anew, a few algorithmic steps have paved the way,” Smeulders and Gevers wrote. “One was the SIFT-feature [Scale Invariant Feature Transform]. Others were bags of words and other coding schemes, and feature localization algorithms, as well as feature learning from large sets of image data. Then GPUs paved the way for the use of deep learning networks, further enhancing the performance as they have sped up learning from datasets. That meant tasks that had required weeks could now run overnight. The next big bang will be on the algorithmic side again.”
Given the amount of money being poured into this market, there is little doubt that something big is ahead. How big remains to be seen, but with this much activity around the approach, expectations are very high.