Spreading Intelligence From The Cloud To The Edge

Explosion of data is forcing significant changes in where processing is done.


The challenge of partitioning processing between the edge and the cloud is beginning to come into focus as chipmakers and systems companies wrestle with a massive and rapidly growing volume of data.

There are widely different assessments of how much data this ultimately will include, but everyone agrees it is a very large number. Petabytes are simply rounding errors in this equation, and at current growth rates they soon will be replaced by exabytes.

“An autonomous vehicle will produce 15 terabytes of data per hour from sensors,” said Sumit Gupta, vice president of HPC, ML and AI at IBM, in a presentation at the recent Autonomous Vehicle Hardware Summit in San Jose. “When you take all the data from ADAS testing, that can get to 500 petabytes of data. Now, when we have to move it to a research center, the fastest way to move it is on a truck down the highway.”

Even storing that much data goes well beyond the capabilities of the largest supercomputer, which currently has about 200 petabytes of storage, Gupta said.
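To put those volumes in perspective, the arithmetic can be sketched in a few lines. This is a rough estimate only, assuming decimal units (1 petabyte = 10^15 bytes) and a link that sustains its full nominal rate with no protocol overhead. The 50 and 400 gigabit-per-second rates are the link speeds cited later in this article, and the function names are illustrative.

```python
# Back-of-the-envelope transfer times for the volumes quoted above.
# Assumptions: decimal units (1 TB = 10**12 bytes, 1 PB = 10**15 bytes)
# and a link that sustains its full nominal rate with no protocol overhead.

TB = 10**12
PB = 10**15
SECONDS_PER_DAY = 86_400


def sustained_gbps(bytes_per_hour: float) -> float:
    """Average bit rate, in gigabits per second, needed to keep up with a source."""
    return bytes_per_hour * 8 / 3600 / 1e9


def transfer_days(total_bytes: float, link_gbps: float) -> float:
    """Days needed to push total_bytes through a link running at link_gbps."""
    seconds = total_bytes * 8 / (link_gbps * 1e9)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"15 TB/hour is about {sustained_gbps(15 * TB):.1f} Gb/s sustained")
    for gbps in (50, 400):
        days = transfer_days(500 * PB, gbps)
        print(f"500 PB over {gbps} Gb/s takes about {days:.0f} days")
```

Under those assumptions, 500 petabytes would occupy a single 400 Gb/s link for roughly 116 days, which is why a truck remains competitive.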

To handle all of these bits, at least some processing has to be done at the edge. It takes far too much time, energy and money to move it all—and the bulk of it is useless. But so far there is no agreement on how or where this will be done, or by whom. Cloud providers still believe hyperscale data centers are the most efficient tool to grind down the mountains of operational data produced by IoT devices every day. Device makers, in contrast, believe they can pre-process much of that data at or close to the source if they can put a smart enough, purpose-built machine learning inference accelerator in the device.

“The edge is becoming more and more intelligent,” said Lip-Bu Tan, president and CEO of Cadence. “Sending everything to the cloud is too slow, so you’re going to see the edge starting to take off. The hyperscale cloud will continue to explode, but for automotive and industrial the activity will be at the edge. The next big thing is the edge.”

Definitions vary about what constitutes the edge. “The edge is between the IoT device and the cloud,” said Tan. “It’s a mini-cloud, but it’s not so massive and it will be energy-efficient. There will be an automotive cloud and different vertical clouds.”

One of the new terms is edge clouds. But that’s not the only place where intelligence will be added in order to process some of this data.

“Broadly speaking, what we’re seeing is that compute is happening all along the infrastructure, so it would be disingenuous to try to call out one specific area and say that’s where the intelligence is being added,” according to Mohamed Awad, vice president of marketing for Arm’s Infrastructure Business. “There is a heterogeneity in compute, one dimension of which is the idea that you have this influx of data from trillions of devices that has to be processed efficiently and transported through the infrastructure. But there is another dimension that is changing the very nature of compute. The classic model where you have a general-purpose CPU and you write your workload once and then sit back and let Moore’s Law keep you moving forward has changed.”

So far, there is no agreed-upon demarcation point for where the edge begins or ends. For some experts, it starts at the sensor. For others, it may be a corporate campus or some other private cloud.

“The problem with defining this is there is no incumbent instruction set in the edge,” said Steven Woo, Rambus fellow and distinguished inventor. “So we’re seeing x86 architectures pushing down, and Arm-based architectures pushing up. The real issue, though, is that if the end point is generating data faster than network links improve in performance, then you don’t have a choice but to do more processing closer to the device. This is why every cell phone processor now has a neural network processor in the core, because some of this has to be processed at the end point.”

Moving data
This has set off a scramble on the infrastructure side to enable “pervasive intelligence,” an updated version of IBM’s old “pervasive computing” idea. The basic problem is that the existing infrastructure never will be fast enough or reliable enough to send all of the data from a moving car to the cloud and back again in time to avoid an accident, even with millimeter-wave 5G. But it’s not out of the question to send a warning from one vehicle to another that isn’t visible yet to avoid an object in the road or to slow down because there is an icy patch around the next bend.

“The key is reducing latency,” said Mike Fitton, senior director of strategic planning at Achronix. “There will be more and more functionality pushed toward the edge.”

5G will help significantly in this regard. So will a reduction in the amount of data that needs to be sent. “With machine learning, the goal is sparsity of data,” Fitton said. Being able to slim that down will help with both storage and power. “Basically, what you’re doing is removing zeroes, and FPGAs are really good for that.”
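The "removing zeroes" idea can be illustrated with a minimal sketch that stores only the non-zero entries of a vector along with their positions. This is an illustration in software, not a description of how an FPGA implements it, and the function names and sample values are hypothetical.

```python
# Minimal sketch: drop zeros from a vector by storing (index, value) pairs.
# This illustrates the idea only; it is not how an FPGA implements it.

from typing import List, Tuple


def compress(dense: List[float]) -> Tuple[int, List[Tuple[int, float]]]:
    """Return the original length plus the non-zero entries with their positions."""
    nonzero = [(i, v) for i, v in enumerate(dense) if v != 0]
    return len(dense), nonzero


def decompress(length: int, nonzero: List[Tuple[int, float]]) -> List[float]:
    """Rebuild the dense vector, filling the gaps with zeros."""
    dense = [0.0] * length
    for i, v in nonzero:
        dense[i] = v
    return dense


if __name__ == "__main__":
    activations = [0.0, 0.0, 1.5, 0.0, 0.0, 0.0, -2.0, 0.0]
    length, packed = compress(activations)
    print(f"kept {len(packed)} of {length} values: {packed}")
    assert decompress(length, packed) == activations
```

Whether this actually saves storage or bandwidth depends on how sparse the data is, because each surviving value now carries an index with it.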

That’s only one piece of the puzzle, even though it’s a necessary one. Fitton noted that during a 12-hour flight a commercial airplane can generate 844 terabytes of data.

Figure 1. Fog computing supporting a cloud-based ecosystem for smart end-devices. Source: National Institute of Standards and Technology.

Faster Ethernet and SerDes will help keep data moving, as well. In fact, the speed at which data will travel is set to increase from 50 gigabits per second today to about 400 gigabits per second by 2021, and it could double again with optical technology, according to Andy Bechtolsheim, chairman and chief development officer at Arista Networks. What’s particularly significant in this market is that today’s leading-edge switching technology was developed at 28nm using custom hardware. But the switch market is expected to shift to 7nm merchant chips over the course of the next two years. Because those chips are less expensive than custom silicon, he said it will cause a “dramatic ramp up in a very short time” of both performance and throughput.

“There is much more room for improvement,” Bechtolsheim said during a presentation at CDNLive this week. “We will triple the number of transistors.”

Other economic factors are at play here, too. He said that moving to new silicon will cause a 10X decrease in cost per port. At the same time, a single 400Gbps pipe is cheaper to develop and maintain than four 100Gbps pipes. This is going to drive continued movement of data everywhere, and while more data certainly will be processed locally, more data will still be moving to the cloud than is processed on premises.

Securing data
Also entering into the edge discussion is how to secure that data. The general thinking among security experts is that data is at greater risk when it is in motion, regardless of whether that risk includes outright theft or simply data leakage.

But there also is a growing fear that data encrypted today can be hacked at some point in the future using more powerful computers and better algorithms. “Imagine that someone is storing all of the encrypted traffic,” said Taher Elgamal, a cryptography expert and winner of this year’s Marconi Prize. “If you stockpile data, some of that will still be valid in 25 years.”

The difference is that what is considered state-of-the-art encryption today will be relatively easy to crack with future generations of computers. And the less data that is in motion, the less likely someone will be able to collect it and store it.

“There are two main ways to add security: key security and encryption,” said Tejinder Singh, general manager of security solutions at Marvell Semiconductor. “Today, encryption is more popular. But if you have the key, you can decrypt that data.”

This sounds straightforward enough, but it gets significantly more complicated when data is partitioned between the edge and the cloud. “What’s needed is a more flexible solution because you want to provision each application differently. Partitions need to be isolated. So if you have 32 partitions, keys may belong to Domain 1 and not Domain 2. It’s up to each user to decide if the key should be shared with other users. It also allows you to shut down a partition with a single command.”

Marvell started out working with the public cloud, but it has since migrated this thinking into the private cloud. Along with that, a unique identifier is injected into a chip during manufacturing to create keys for that device. “You can do this for multi-core processors, too, where each core does something different,” he said.
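The partitioning behavior described above can be sketched as a toy key store in which each key belongs to one partition, sharing is an explicit opt-in by the owner, and one call shuts a partition down. This is a hypothetical illustration, not Marvell's product or API. The class name, method names and in-memory storage are all assumptions made for the example.

```python
# Toy model of partition-isolated keys. Everything here (class, methods,
# in-memory dictionaries) is hypothetical and for illustration only.

import secrets


class PartitionedKeyStore:
    """Hypothetical key store in which each key belongs to exactly one partition."""

    def __init__(self, partition_count: int) -> None:
        self._keys = {p: {} for p in range(partition_count)}  # partition -> {name: key}
        self._shared = set()  # (owner, key name, grantee) triples
        self._active = set(range(partition_count))

    def add_key(self, partition: int, name: str, key: bytes) -> None:
        self._require_active(partition)
        self._keys[partition][name] = key

    def share(self, owner: int, name: str, grantee: int) -> None:
        """The owner explicitly opts in to sharing one key with one other partition."""
        self._require_active(owner)
        if name not in self._keys[owner]:
            raise KeyError(name)
        self._shared.add((owner, name, grantee))

    def get_key(self, caller: int, owner: int, name: str) -> bytes:
        self._require_active(caller)
        self._require_active(owner)
        if caller != owner and (owner, name, caller) not in self._shared:
            raise PermissionError(f"partition {caller} may not read {name!r}")
        return self._keys[owner][name]

    def shut_down(self, partition: int) -> None:
        """Single call that disables a partition and discards its keys and grants."""
        self._active.discard(partition)
        self._keys[partition].clear()
        self._shared = {g for g in self._shared if partition not in (g[0], g[2])}

    def _require_active(self, partition: int) -> None:
        if partition not in self._active:
            raise RuntimeError(f"partition {partition} is shut down")


if __name__ == "__main__":
    store = PartitionedKeyStore(partition_count=32)
    store.add_key(1, "tls", secrets.token_bytes(32))
    try:
        store.get_key(caller=2, owner=1, name="tls")
    except PermissionError as err:
        print("denied:", err)
    store.share(owner=1, name="tls", grantee=2)
    print("shared key length:", len(store.get_key(caller=2, owner=1, name="tls")))
    store.shut_down(1)
```

A real design would keep keys in protected hardware rather than a Python dictionary. The sketch only shows the access rules.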

Architecting a system
Putting all of these pieces together creates some new opportunities for chipmakers.

“Edge processing has the potential to decrease overall power consumption,” said Jeff Miller, senior product marketing manager at Mentor, a Siemens Business. “The radio can be a significant portion of the power in a device. If you decrease the amount of data you send back to the cloud, that takes less power. So if you have a satellite temperature sensor for a thermostat, it only needs to transmit data when it hits a certain temperature, not all of the data along the way. And with a security device, you can minimize the number of false positives.”
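The thermostat example amounts to report-on-threshold logic, which can be sketched as follows. This is a minimal illustration that assumes the sensor reports whenever a reading crosses the setpoint in either direction. The function name, the 22-degree setpoint and the sample readings are all made up for the example.

```python
# Minimal sketch of threshold-based reporting for a remote temperature sensor.
# Assumption: a report is sent whenever the reading crosses the setpoint,
# in either direction. All names and values are illustrative.

from typing import Iterable, Iterator


def readings_to_transmit(readings: Iterable[float], threshold: float) -> Iterator[float]:
    """Yield a reading only when it crosses the threshold in either direction."""
    was_above = None  # unknown until the first sample arrives
    for value in readings:
        is_above = value >= threshold
        if was_above is not None and is_above != was_above:
            yield value  # the radio would wake up and send here
        was_above = is_above


if __name__ == "__main__":
    samples = [20.1, 20.4, 21.0, 22.3, 22.8, 21.5, 20.9]
    sent = list(readings_to_transmit(samples, threshold=22.0))
    print(f"transmitted {len(sent)} of {len(samples)} readings: {sent}")
```

In this run only two of the seven readings would go over the radio, 22.3 on the way up and 21.5 on the way down.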

But more intelligent edge design also adds some interesting challenges.

“There are a lot of very interesting processing units getting pulled into various systems (GPU, CPU, TPU) that are being used to make sure you can put the right kind of cost at the right point to get the result you want without using too much power or generating too much heat or being too expensive,” said Gilles Lamant, distinguished engineer at Cadence. “That’s important in the data center, but there you have plenty of power and cooling. In other kinds of devices you have to look at a wider range of factors, and there are more types of processors to choose from. But people are using optical or ASICs or something else for a specific reason: they ran out of bandwidth or are using too much power or creating too much heat. What you can get away with changes a lot, depending on if you’re in the data center or if you’re talking about a smartphone or a self-driving vehicle.”

The flood of special-purpose accelerators from companies like Google, Alibaba and Facebook, as well as chipmakers, is a recognition that general-purpose chips are not sufficient to achieve the necessary performance for inferencing in a wider range of devices, or to add vision or speech recognition in cars, doorbells and smartphones.

“Today’s infrastructure is primarily designed for video distribution, delivering streaming video to billions of people, downstream only,” Awad said. “You still probably want a camera that only notifies you when a door is open or someone walks through. But beyond the endpoint, think of the change in architecture required when you have a trillion connected devices and a billion or so of them are cameras or other devices pushing video upstream into the infrastructure, not just down toward the endpoint.”

Cisco, Arm, Samsung, Philips and many of the other big players in the IoT and infrastructure market have expanded aggressively into data management, focusing exclusively on chips, according to Bill Hoffman, president of the Industrial Internet Consortium—an IIoT product-testing organization that recently merged with the OpenFog Consortium. (The latter was instrumental in establishing standards for architecture and performance and sometimes the definition of many aspects of fog computing.)

But even looking just at inferencing in endpoint devices doesn’t make processor choices automatic, said Geoff Tate, CEO of Flex Logix. The performance of inference models created from deep neural networks trained on large datacenter servers depends on the algorithms, but also on whether parts of the matrix multiplication are handled using 16-bit or 8-bit integers or floating-point calculations. It also depends on whether the chip is designed in such a way that it can’t get up to full performance for the first tenth or quarter or third of a training cycle in which the same matrix calculations are run tens of thousands of times. What’s needed are standards defining issues such as latency, and the architecture of compute-intensive tasks that could be handled somewhere other than an endpoint or the cloud.
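The numeric side of that tradeoff can be illustrated with a small sketch that runs the same matrix multiplication in floating point and in 8-bit and 16-bit integer arithmetic. It assumes symmetric, per-matrix quantization, which is one common scheme and not necessarily what any particular chip uses. The matrices and function names are made up for the example.

```python
# Sketch: compare a float matrix multiply with integer-quantized versions.
# Assumption: symmetric per-matrix quantization (one scale factor per matrix).

from typing import List, Tuple

Matrix = List[List[float]]


def quantize(m: Matrix, bits: int) -> Tuple[List[List[int]], float]:
    """Symmetric quantization: map floats to signed integers of the given width."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 32767 for 16-bit
    largest = max(abs(v) for row in m for v in row)
    scale = largest / qmax if largest else 1.0
    return [[round(v / scale) for v in row] for row in m], scale


def matmul(a, b):
    """Plain matrix multiply; works for ints and floats alike."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def quantized_matmul(a: Matrix, b: Matrix, bits: int) -> Matrix:
    """Multiply in integer arithmetic, then rescale the result back to floats."""
    qa, sa = quantize(a, bits)
    qb, sb = quantize(b, bits)
    return [[v * sa * sb for v in row] for row in matmul(qa, qb)]


def max_abs_error(x: Matrix, y: Matrix) -> float:
    """Largest element-wise difference between two matrices of the same shape."""
    return max(abs(p - q) for rx, ry in zip(x, y) for p, q in zip(rx, ry))


if __name__ == "__main__":
    a = [[0.12, -0.53], [0.91, 0.07]]
    b = [[-0.33, 0.80], [0.45, -0.26]]
    reference = matmul(a, b)
    for bits in (8, 16):
        err = max_abs_error(reference, quantized_matmul(a, b, bits))
        print(f"{bits}-bit integers: max abs error {err:.6f}")
```

Narrower integers are cheaper to store and multiply but lose precision, so the acceptable width depends on the model and the hardware.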

Put in perspective, the need to collect and efficiently scrub, massage and analyze large amounts of data isn’t new. But questions about which device, and in what context, data needs to be analyzed never really affected large-scale systems or processor-purchase decisions in the past, said Susheel Tadikonda, vice president of networking and storage at Synopsys.

Other issues
Sending everything to the cloud or a centralized data center was always an obvious and largely bulletproof decision. In fact, it only became a point of discussion when the real or potential volume of data grew high enough to make bandwidth and cloud-processing costs a bigger part of the discussion.

No one would be talking about having security cameras or gateways that act as servers for 1,000 temperature sensors play a significant role in data processing if the number of devices collecting or contributing data hadn’t become so extreme.

“With a limited number of devices, there’s no point in trying to invent a new way to process the data,” Tadikonda said. “But the number of devices is growing so fast that, even if each only produces a little data, they’re producing a lot more in total. That changes the equation a little.”

Some of the end-user customers that own those IoT devices have been collecting large amounts of data for decades, but only in the last few years has it been possible to analyze that data well enough to deliver the predictive-maintenance and supply-chain efficiency gains that big companies now see, according to Robert Golightly, product marketing lead at Aspen Technology, which develops tools to squeeze greater efficiency from industries that are already quite good at efficiency.

“Companies in process-focused industries—chemical production, oil, mining—have been collecting terabytes of data per day on their own equipment and operations for probably 20 years,” Golightly said. “The problem for them wasn’t collecting the data or knowing what to do with it or what device to use. It was mostly a question of different people owning different parts of the process and not being able to commit across those divisions. Once a few of them figured out how to cross that barrier, I saw customers you expect to have a sales cycle of 18 months getting projects underway in less than six. Unplanned downtime is a $1.4 trillion per year problem. That’s a good motivator.”

The benefits are often clear enough, but the logistics, architecture, design, cost models and performance metrics are often missing. This is especially true for the IT people who often take the lead on the non-traditional parts of an IIoT project, according to Matt Vasey, director of AI and IoT business development at Microsoft and former chairman and president of the OpenFog Consortium.

“That’s one reason the OpenFog Consortium focused on creating a standard,” Vasey said. “Manufacturing has been one of the first industries to make inroads into the IIoT because it was understood how to automate those processes, which were often documented.”

“For other types of organizations there aren’t a lot of metrics to help define what works. We see a lot of models with accelerated compute modules, like GPUs at the edge, because they’re near the control plane. But we’re also seeing more models that push autonomy and AI into endpoint devices that have the potential to automate things that weren’t addressed before, while connecting them via OpenFog architectures to cloud applications so they don’t lose the processing efficiency of the cloud.”

Architectures and ideas about them are all over the place right now. Not every organization puts inference or storage or processing power at every tier of their infrastructure today, but the volume of compute resources in those tiers and in edge computing gateway server facilities is growing.

“We see tiering all the way from the endpoint to the cloud data center, and at every stage along the way compute requirements are increasing,” said Arm’s Awad. “So there can be some level of filtering and analysis of that data at every point it touches in the infrastructure, from the time it’s onboarded through a gateway all the way to the cloud.”
