Experts at the Table: A look at the intelligence in devices today, and how that needs to evolve.
Semiconductor Engineering sat down to define what the edge will look like with Jeff DeAngelis, managing director of the Industrial and Healthcare Business Unit at Maxim Integrated; Norman Chang, chief technologist at Ansys; Andrew Grant, senior director of artificial intelligence at Imagination Technologies; Thomas Ensergueix, senior director of the automotive and IoT line of business at Arm; Vinay Mehta, inference technical marketing manager at Flex Logix; and John Sanguinetti, CTO of Adapt. What follows are excerpts of that conversation. To view part one of this discussion, click here.
SE: Until recently, almost all of the intelligence in AI systems was in the cloud. Now it is moving down to the edge. The problem is that AI algorithms are changing almost daily. So now we have these systems of things that are connected, but they’re not necessarily going to be in sync. Does the intelligence move to the network, and is it the network that gets updated or the devices on that network?
DeAngelis: There’s a whole span of intelligence in devices, depending on where they sit. It may be an appliance, a thermostat, or something that needs to be more flexible. We look at the basic level of information an application needs to gather. We are able to extract certain types of parameters, fundamental information that will be required to make decisions no matter what kind of intelligent algorithm you overlay. In some cases we leave that to an embedded processor, where you can customize the processor for the application based on the overlay of what you need. In other cases, we just provide that basic information. If it’s more of an appliance, or a category of devices that we need to target, we can optimize a set of specific IP to customize and embed into the logic of the device. It’s not flexible, because it’s embedded, but it’s a lower-cost way of providing a higher-level type of product that’s smart. On the flip side, if you integrate an embedded processor, that gives you the ability to customize, depending on what needs to happen moving forward. So for us, what’s important is the basic level of diagnostic information that’s required, meaning what’s going on with the device, depending on where it sits in the network. And then, finally, there is the capability of overlaying some type of algorithm. With the ability to start doing AI acceleration and learning, these types of ICs are becoming more prominent, as well. So there’s the ability to embed some of that technology into these devices, and you’re going to get the span from high-end devices to more appliance-like devices.
SE: So it varies by application?
DeAngelis: Yes, it’s a spectrum of solutions. There are basically three categories. There’s the appliance, there’s something in the middle where you have OTP or MTP capability so you can customize the part, and then there is the high-end embedded processor, where you have complete flexibility. But that middle area is key, where you’re able to provide a core set of hardware that can then be customized or modified at end-of-line test for what that device needs to be. That’s how we get to economies of scale and low-cost solutions.
Ensergueix: The diversity of IoT is extreme. There are hundreds of applications and hundreds of segments. We’ve been trying to reduce the complexity of the workloads, looking at them more from a software point of view. Common ground emerges around what we call the three V’s: vision, voice and vibration. Vibration is all about sensing motion or quality maintenance. With voice, we’re seeing more and more voice-enabled devices. This is great in this time of the COVID-19 virus, because you do not have to touch a device to activate it. And with vision, we’re seeing more and more detection, or detection of a situation, being managed by a small device that makes the decision locally. You can scale this in terms of hardware designs. We are building more platforms where you can scale CPUs and NPUs. For example, with our NPU we can scale the number of neural network processors, and we can scale the number of MACs from 32 up to 256. So if you have a base design, for each IP you can select the performance level you want. One of the big risks is that for each new SoC, or each new ASIC, you are starting from scratch. You have to be careful about security. We feel much more comfortable having a very well-proven security architecture on the chip, and then re-using that without affecting the performance too much. For example, we have been promoting the platform security architecture at Arm, which is really all the best practices to build a secure device that will be connected to the Internet.
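As a rough illustration of why a configurable MAC count matters, the short sketch below estimates peak throughput for the 32-to-256 range mentioned above. The clock frequency and the throughput formula are assumptions for illustration only, not Arm specifications.

```python
# Back-of-the-envelope illustration (assumed numbers, not Arm specs):
# how a configurable MAC count scales peak multiply-accumulate throughput.
CLOCK_HZ = 500e6  # assumed NPU clock, illustrative only

for macs_per_cycle in (32, 64, 128, 256):
    # One MAC is counted as two operations (a multiply and an add).
    peak_ops = 2 * macs_per_cycle * CLOCK_HZ
    print(f"{macs_per_cycle:>3} MACs/cycle -> {peak_ops / 1e9:.0f} GOPS peak")
```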
Grant: The use case is really, really important, and so is the question of what sorts of networks can be run. If it’s facial recognition, or something in a vehicle, or a robot that’s disinfecting streets, each of these is similar but different. The ability to be flexible enough to run literally hundreds of neural networks, and as they keep changing, to use your offline workflow to recompile and then perhaps push to the device, will be critical. For many of the people we talk to, the value is in what they can do with the platform that you’re giving them. Their differentiation comes from that. We used to joke that whenever a customer came to talk to you, you’d say you could do ‘this’ or you could do ‘that.’ But the next time they came in, their performance requirements would have doubled, because each time they went away and thought, ‘It’s really interesting that we could do this. We never thought about that.’ They would do their research and start increasing their requirements. This is great, because what we’re seeing now is the market changing and developing in front of our eyes. That opens up lots of opportunities for everyone.
Chang: If you look at a Model 3 or Model S Tesla, those actually are intelligent devices. We are right at the beginning of this trend. Tesla is focused on uploading the inference engine. It’s already trained. It already can visualize and hear and do all kinds of sensing. But it’s not doing active learning on the spot yet. So for the intelligent device, during the next two or three years the focus will be on the inference engine and its adaptation on the server. Once you find you can adapt it and train it, after two or three years the focus will shift to active learning. That means you will need a more efficient ML or DL engine for these devices so you can enable them to do active learning in the environment. And that is a very essential characteristic. If you have a sensor or an intelligent device on Mars, it takes a long time for a signal to come back to Earth, but your survival may be determined in 10 seconds. You need to make some changes on the spot. That’s one of the new trends that will show up in a couple of years. People will be focusing on the active learning part of the intelligent device.
Ensergueix: We already see this in some predictive maintenance projects, with the kind of sensor that learns what a normal operating condition is for a motor, for example. You plug in a very small device, which has an MCU, and it takes the first few hours or days to learn what normal vibration looks like in different environments. So we are starting to see the first steps toward this, and we are excited to see even more learning capability.
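A minimal sketch of the kind of on-device baseline learning described above, assuming a simple statistical approach: track the running mean and variance of a vibration feature during a learning period, then flag readings that fall outside it. The RMS feature and the 3-sigma threshold are illustrative assumptions, not details from the discussion.

```python
# Illustrative sketch: learn a "normal" vibration baseline on a small device,
# then flag readings that deviate from it. Feature choice (RMS) and the
# 3-sigma rule are assumptions for illustration only.
import math

class VibrationBaseline:
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford's method)

    def learn(self, rms_value: float) -> None:
        """Update the baseline during the initial learning period."""
        self.n += 1
        delta = rms_value - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (rms_value - self.mean)

    def is_anomalous(self, rms_value: float, k: float = 3.0) -> bool:
        """Flag a reading more than k standard deviations from normal."""
        if self.n < 2:
            return False
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(rms_value - self.mean) > k * std

def rms(window):
    """Root-mean-square of one window of accelerometer samples."""
    return math.sqrt(sum(x * x for x in window) / len(window))
```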
Mehta: We are really focused on the inference portion of this, which from a nuts-and-bolts standpoint is far simpler than training. To train neural networks you need to support back-propagation, as opposed to just inference. So using a crude approximation, you need to do half the amount of work to just make a decision, rather than to make a decision and then learn what a better decision would have been. For Tesla actuators, you are still giving the device a set of options and models to operate within. For an actuator, you do need to send that data back to a data center in order to re-train the model. It can tell when something is important and that a better decision could be made, but at this point it cannot make a better decision. That still requires processing in a data center. But in terms of the actual workloads, what we’re seeing is that a lot of these convolutional nets require the same kind of base compute operation, which is handled by a generalized matrix multiplication algorithm that can be decomposed into a set number of multiply-accumulates. How you architect those multiply-accumulates will give you better performance for some specific class of matrix multiplications that you find in 2D convolutions and some DSP filters, but maybe not so much in the LSTM (long short-term memory) class of voice recognition and word generation. So the question is where the application is constrained or bounded enough that, if we give this to a customer, they are able to do something useful. But it also cannot be so unbounded that you can’t build an architecture to do that work.
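The point that a convolution reduces to a generalized matrix multiplication can be made concrete with the standard im2col lowering. The sketch below is illustrative only, assuming a single input in channel-major layout with stride 1 and no padding; it is not any vendor's implementation.

```python
# Illustrative only: lowering a 2D convolution to a single matrix multiply
# (im2col + GEMM), i.e., the decomposition into multiply-accumulates
# described above. Layout, stride 1, and no padding are simplifications.
import numpy as np

def conv2d_as_gemm(x, w):
    """x: input (C_in, H, W); w: weights (C_out, C_in, kH, kW)."""
    c_in, h, width = x.shape
    c_out, _, kh, kw = w.shape
    oh, ow = h - kh + 1, width - kw + 1

    # im2col: each output position becomes one column of patch values.
    cols = np.empty((c_in * kh * kw, oh * ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[:, i:i + kh, j:j + kw]
            cols[:, i * ow + j] = patch.reshape(-1)

    # GEMM: (C_out, C_in*kH*kW) @ (C_in*kH*kW, OH*OW) -> (C_out, OH*OW)
    out = w.reshape(c_out, -1) @ cols
    return out.reshape(c_out, oh, ow)
```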
SE: You’re designing for things in motion and things that may change over time. So it’s going to have to be fairly complex even if it is low cost. You’re interfacing with the real world, and now you have to deal with things like drift and aging. And at the same time, it all has to be integrated into something else. How do these conflicting pieces go together?
Chang: If you look at an intelligent device as a packaged system, you need to have digital and analog portions in the design. Each has a different concern. One is binary and the other is a continuous variable. In terms of the devices, they have different requirements. For the sensors and actuators to co-exist, and then to add intelligence into the design, you need to consider the system perspective and design every component to co-exist well. That includes consideration of environmental factors, like whether it is running at 40°C or 155°C. Is it in the harness of a vehicle or in an arctic environment? You have to pre-think all those conditions in the design at the system level, or the chip-package-system level. And of course, using multiphysics simulation, which right now is pretty advanced, you can consider most of the physics through simulation. That means you don’t need to really test it in a harsh environment. You can do the testing through simulation. And then, through learning from the simulation, you can build a reduced-order model for the machine learning inference engine. Then you can upload it to the intelligent device, and it goes with that device to the deployment location and becomes part of that device. It’s a grand scheme for the intelligent device, and if you can do some local learning, that’s even better. But you also need to have built-in security to make sure no one can crack the device and then get into your network and the servers for the whole company.
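A hypothetical sketch of the flow described above: fit a small reduced-order (surrogate) model to detailed simulation results so that a cheap approximation can run on the device. The inputs and outputs (power and ambient temperature predicting junction temperature), the quadratic form, and the sample values are all illustrative assumptions, not a specific Ansys flow.

```python
# Hypothetical reduced-order model fit. The data points below stand in for
# results of a sweep of detailed multiphysics simulations; they are made up
# purely for illustration.
import numpy as np

power_w    = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
ambient_c  = np.array([25.0, 25.0, 40.0, 40.0, 85.0, 85.0])
junction_c = np.array([35.0, 46.0, 72.0, 84.0, 140.0, 152.0])

# Least-squares fit of a simple quadratic surrogate: Tj ~ a + b*P + c*Ta + d*P^2
A = np.column_stack([np.ones_like(power_w), power_w, ambient_c, power_w**2])
coeffs, *_ = np.linalg.lstsq(A, junction_c, rcond=None)

def predict_junction_temp(p, ta):
    """Cheap on-device evaluation of the fitted reduced-order model."""
    return coeffs @ np.array([1.0, p, ta, p * p])
```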
Sanguinetti: We’ve been talking about all these capabilities that we can see will be needed in edge devices in the future. That includes machine learning and intelligence and responding in a dynamic way to the needs of the particular application. But we also have a vast set of problems that can be solved with things we already know about. If you look at maintenance of electrical transformers, you can put a sensor in the vicinity of a transformer, analyze the gases coming off of it, and tell when it is about to fail. It won’t fail immediately, so you have time to send the data back to the server and go out and fix it. But that doesn’t require machine learning. It’s already known what the capabilities are and what the composition of the exhaust gases is. There are a lot of things like that, which don’t really require all that much intelligence. Yet we still haven’t been able to make it economically viable to implement that kind of a capability for an electrical system. If you could do that for a reasonable price, you could automate the electrical grids of every electrical company in the world. There’s a gulf between where we are today and where things are going. In the EDA world, we’ve always understood that making a leap too far was never a good idea, while with incremental change you could succeed. We may very well be in a situation like that now.
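The transformer example can indeed be handled with fixed rules rather than machine learning, in the spirit of dissolved-gas analysis. The sketch below is a generic threshold check; the gas names are standard for this kind of monitoring, but the limit values are placeholders for illustration, not the actual IEEE C57.104 or IEC 60599 tables.

```python
# Illustrative rule-based check: compare measured gas concentrations against
# fixed limits and report which gases need a maintenance follow-up.
# Limit values are placeholders, not standardized thresholds.
GAS_LIMITS_PPM = {
    "hydrogen": 100,
    "methane": 120,
    "acetylene": 1,
    "ethylene": 50,
    "ethane": 65,
    "carbon_monoxide": 350,
}

def gases_over_limit(sample_ppm: dict) -> list:
    """Return the gases in this sample that exceed their limits, if any."""
    return [gas for gas, limit in GAS_LIMITS_PPM.items()
            if sample_ppm.get(gas, 0) > limit]

# Example: a reading with elevated acetylene would be flagged for follow-up.
print(gases_over_limit({"hydrogen": 40, "acetylene": 3}))
```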