AI makes interesting reading, but physics will limit just how far it can go and how quickly.
In 1993, Vernor Vinge, a computer scientist and science fiction writer, first described an event called the Singularity—the point when machine intelligence matches and then surpasses human intelligence. And since then, top scientists, engineers and futurists have been asking just how far away we are from that event.
In 2005, Ray Kurzweil published a book, “The Singularity Is Near,” in which he extended the hypothesis that artificial intelligence (AI) would enter a ‘runaway reaction’ of self-improvement cycles. He suggested that each new and more intelligent generation would appear more and more rapidly, causing an intelligence explosion and resulting in a powerful superintelligence that qualitatively surpasses all human intelligence. Various dates have been assigned to when this would happen, with the current consensus being 2040, but even Elon Musk fears that in five years “AI will have become dangerous.”
But it’s not clear that AI and the march toward the Singularity are even close to reality. In five years, we may have one more technology node under our belt, meaning that we can expect twice the number of transistors that we have today. But while power per transistor may drop a little, heat will continue to limit what can be done with those chips. Many chips today cannot use all of their compute power at the same time due to thermal limitations.
If we rewind the clock a few decades, we can trace what got us to this point.
The heart of computing
At the heart of every advance has been an advance associated with the multiply operation, along with the ability to move data into and out of those multipliers and to have an element of programmability associated with them.
“The multiply is the most noticeable arithmetic operation, and plays a central role in the computation of many essential functions — filters, convolutions, transforms, weightings,” says Chris Rowen, CEO of Cognite Ventures. However, Rowen always warns against ignoring the other aspects mentioned.
The first major advance was wireless communications and the rise of the Digital Signal Processor (DSP). It provided single-cycle multiply operations, which until then only had been available in fixed-function hardware. “Wireless communications used to be seen as the epitome of hard compute problems,” says Samer Hijazi, senior architect in the IP Group of Cadence. “It has been and continues to be one of the hardest compute problems. It is an NP-complete (nondeterministic polynomial-complete) problem. The DSP gave you a wide array of multipliers, specifically an array of fixed-point multipliers. How many bits can you trust and use? As people learn more about what is needed, the type of accuracy needed is evolving.”
As applications get more complex, they tend to use a rich variety of arithmetic. “The computation often uses a mix of bit precisions (8b, 16b, 32b, and sometimes odd bit-lengths) and a mix of data formats (integer, fixed point, floating point),” explains Rowen. “This means that an implementation typically needs sufficient flexibility to cover a mix of arithmetic operations, bit precisions and data formats — not just a single form of multiply — to handle the necessary computation without compromising accuracy, efficiency or programmer productivity too much.”
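As a rough illustration of the fixed-point multiply-accumulate work that DSP multiplier arrays speed up, the sketch below runs a small filter in pure Python. The Q15 number format, the helper names (`to_q15`, `from_q15`, `fir_q15`) and the tap values are assumptions chosen for illustration and are not taken from any particular DSP.

```python
Q = 15  # assumed Q15 format: 1 sign bit, 15 fractional bits

def to_q15(x):
    """Convert a float in [-1, 1) to a 16-bit fixed-point integer."""
    v = int(round(x * (1 << Q)))
    return max(-(1 << Q), min((1 << Q) - 1, v))  # saturate to the 16-bit range

def from_q15(v):
    """Convert a Q15 integer back to a float."""
    return v / (1 << Q)

def fir_q15(samples, taps):
    """FIR filter: one multiply-accumulate per tap per output sample."""
    out = []
    for n in range(len(taps) - 1, len(samples)):
        acc = 0  # wide accumulator, so 16b x 16b products do not overflow
        for k, t in enumerate(taps):
            acc += samples[n - k] * t
        out.append(acc >> Q)  # rescale the sum of products back to Q15
    return out

taps = [to_q15(0.25)] * 4  # 4-tap moving average
signal = [to_q15(x) for x in (0.5, 0.5, 0.5, 0.5, -0.5, -0.5, -0.5, -0.5)]
filtered = [from_q15(v) for v in fir_q15(signal, taps)]
print(filtered)
```

Every output sample costs one multiply per tap, which is why a single-cycle multiplier matters so much for this class of workload. The choice of 15 fractional bits is exactly the kind of "how many bits can you trust" decision described above.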
The birth of AI
Artificial intelligence always has been an element of science fiction and this, like many other things in the technology world, does have an impact on the course of development. “For AI, there is one algorithm that has made a big comeback and has enabled the whole industry to rise again,” says Hijazi. “It is an algorithm from the late ’90s called Convolutional Neural Networks (CNN).”
At the crux of it, convolution is just a 3D filter. “It performs a repeated filter that is applied to an entire scene,” explains Hijazi. “It is looking for a specific pattern that you are correlating with every location in the scene and trying to see if it exists. You are doing multiples of patterns at a time and you are doing it in layers. In the first layer, you are looking for some pattern and creating a pattern correlation map or a feature map and then running another correlation map on the first map produced, and so on. So, I am building sequential pattern layers on top of each other. Each of them is limited in some field of view.”
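The layered pattern matching Hijazi describes can be sketched in a few lines of Python. This is a simplified, single-channel 2D version with hand-picked patterns. A real CNN learns its filters during training and works across multiple channels, and the names `correlate2d` and `relu` and the toy data are assumptions made for illustration.

```python
def correlate2d(image, kernel):
    """Slide the kernel over every valid location and record the match score."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    fmap = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            score = 0
            for i in range(kh):
                for j in range(kw):
                    score += image[r + i][c + j] * kernel[i][j]  # one multiply per weight
            row.append(score)
        fmap.append(row)
    return fmap

def relu(fmap):
    """Keep positive matches only (a common nonlinearity between layers)."""
    return [[max(0, v) for v in row] for row in fmap]

# Toy 5x5 scene with a dark-to-bright vertical edge between columns 1 and 2.
scene = [[0, 0, 1, 1, 1] for _ in range(5)]

# Layer 1: hypothetical 2x2 pattern that responds to such an edge.
edge = [[-1, 1],
        [-1, 1]]
layer1 = relu(correlate2d(scene, edge))

# Layer 2: looks for edge responses stacked vertically in the first feature map.
stack = [[1],
         [1]]
layer2 = correlate2d(layer1, stack)

print(layer1)
print(layer2)
```

Each filter only sees a small window (its field of view), but the second layer effectively sees a 3x2 region of the original scene because it is built on top of the first feature map. The four nested loops also show where the multiplies pile up.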
Convolutional Neural Networks were first developed by Yann LeCun, who went on to become founding director of the NYU Center for Data Science. He is currently director of AI research for Facebook. The first application was an attempt to recognize the zip codes on letters. “It did not become mainstream because they did not have the necessary compute power,” points out Hijazi. “It was only with the availability of massive GPUs that it became possible to show the superiority of the algorithms over the ones that had been developed by the experts.”
But while the multiplier may be important, it is just one piece of a system. “Even an extreme vision processor, built to sustain hundreds of multiplies per cycle for convolutional neural network inner loops, dedicates little more than 10% of the core silicon area to the multiply arrays themselves,” says Rowen. “The other area is allocated to operand registers, accumulators, instruction fetch and decode, other arithmetic operations, operand selection and distribution and memory interfaces.”
The modern-day graphics processing unit (GPU), which is being used a lot for implementation of CNNs, also has an extensive memory sub-system. “Another piece that is essential for graphics is the massive hierarchical memory sub-system where data is moving from one layer to another layer in order to enable smooth transitions of pixels on the screen,” says Hijazi. “This is essential for graphics but not as needed for AI tasks. It could live with a memory architecture that is less power hungry.”
Another solution being investigated by many is the Field Programmable Gate Array (FPGA). “FPGAs have many DSP slices and these are just an array of fixed point multipliers,” continues Hijazi. “Most of them are 24-bit multipliers, which is actually three or four times what is needed for the inference part of deep learning. Those DSP slices have to be coupled to the memory hierarchy that would be utilizing the FPGA fabric to move the data around. The power consumption of an FPGA may not be that much different from a GPU.”
Rowen provides another reason for favoring programmable solutions. “Very few applications are so simple and so slowly evolving that they can tolerate completely fixed-function implementations. Programmability may come in the form of FPGA look-up tables and routing, or in the form of processors and DSPs, but some degree of programmability is almost always required to keep a platform flexible enough to support a set of related applications, or even just a single application evolving over time.”
But those DSP slices in the GPU and FPGA may not be ideal for AI. “It may be possible that only 4-bit multiplication is necessary,” says Hijazi. “So the race to reduce the cost of the multiplier is at the core of how we can advance AI. The multiplier is expensive, and we need a lot of it. It limits the flexibility of this newfound capability.”
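To make the bit-width trade-off concrete, the sketch below compares a floating-point dot product with versions that use 8-bit and 4-bit integer multiplies. The symmetric linear quantization scheme, the function names (`quantize`, `quantized_dot`) and the sample values are assumptions for illustration. Production inference engines use more elaborate schemes.

```python
def quantize(values, bits):
    """Symmetric linear quantization to signed integers of the given width."""
    qmax = (1 << (bits - 1)) - 1  # 7 for 4-bit, 127 for 8-bit
    scale = max(abs(v) for v in values) / qmax
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale

def quantized_dot(weights, activations, bits):
    """Dot product using narrow integer multiplies and a wide accumulator."""
    qw, sw = quantize(weights, bits)
    qa, sa = quantize(activations, bits)
    acc = sum(w * a for w, a in zip(qw, qa))  # integer multiply-accumulate
    return acc * sw * sa  # a single rescale back to real units

weights = [0.9, -0.4, 0.1, 0.7]
activations = [0.2, -0.8, 0.5, 0.3]

exact = sum(w * a for w, a in zip(weights, activations))
approx8 = quantized_dot(weights, activations, 8)
approx4 = quantized_dot(weights, activations, 4)

print(exact, approx8, approx4)
```

In this toy case the 4-bit result is noticeably coarser than the 8-bit one but still lands close to the floating-point answer. Whether that loss is acceptable depends on the network and the task.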
It would seem likely that chips dedicated to AI will be produced. “2017 will see a number of chips targeted at AI and several demonstrable technologies by year end,” predicts Jen Bernier, director of technology communications for Imagination Technologies. “As companies develop chips for AI, they need to consider the increased demands to process data locally and relay data to the cloud for onward processing and data aggregation.”
The reality today
So how close to the Singularity are we? “The algorithm that we are using today was created in the ’90s and has created a lot of hype in the media,” says Hijazi. “But this all stems from one algorithm and its ability to solve one interesting problem — computer vision. The hype about extrapolating this capability has created a lot of enthusiasm, and the media loves the original premise of AI from the ’50s that may be coming to roost. AI did not make a significant leap. One algorithm was developed that enabled one advance.”
People are finding ways to use that algorithm for other tasks, such as Google using it to play the game of Go. Another example is related to voice recognition. “Virtual assistants will be virtually everywhere,” says Bernier. “Voice recognition and interaction will be incorporated into an increasing number of devices and we’ll see new classes of hearable devices. The technology will continue to evolve for more and more interactivity.”
Other advances expected in this area are discussed in the Predictions for 2017.
But does any of this directly lead us to the Singularity? Would an AI have been able to invent the algorithms or the hardware structures that got us to this point? AI may well be able to help us optimize what we have, but that is not the Singularity. Engineers, it would seem, see the future in a more rational manner.