Cloud Vs. On-Premise Analytics

Not all data analytics will move to the cloud, but the very thought of it represents a radical change.


The immense and growing volume of data generated in chip manufacturing is forcing chipmakers to rethink where to process and store that data.

For fabs and OSATs, this decision is not one to be taken lightly. The proprietary nature of yield, performance, and other data, and corporate policies to retain tight control of that data, have so far limited outsourcing to the cloud. But as the amount of data rises, along with demands by more engineers to access data in real time, the chip manufacturing and packaging sector is beginning to re-evaluate its rigid policies.

Several underlying changes could have a big impact on where some or all of that data ultimately resides:

  • Pricing. Whether data resides on-premises or in a cloud hosted by a third party, the costs of managing immense quantities of data are comparably immense.
  • Data sharing. There are big benefits to sharing data across the various sites where a device has been manufactured, as well as data on its behavior in the field. To reap those benefits, the data needs to be uploaded to a central place, and cloud companies have the most up-to-date infrastructure and the lowest latency.
  • Security. While many companies believe that it’s better to physically control data on-site, the reality is that cloud facilities often have better security because that’s their core business.

“We have seen the evolution of data storage architectures, from unstructured file folders to SQL databases, to Oracle RACs, to Data Appliances, to on-premises Hadoop, to the Cloud,” said Rao Desineni, director of data analytics for manufacturing and operations at Intel. “Without getting into the details of each, suffice it to say that the semiconductor industry has been continuously piloting and adopting efficient data management solutions to drive more analytics for yield and quality, and to improve engineering productivity and cost.”

On-premises data warehouses have been the norm in the semiconductor industry. But the cloud provides the scalability, flexibility, and IT support semiconductor companies need to manage surges in computing and data storage, along with outside data management expertise.

Not everything will move there, of course. Support is growing for a hybrid model, in which huge amounts of data are processed and stored in the cloud to generate predictive models, which are then applied locally at the manufacturing or test decision point.
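As a rough illustration of that hybrid pattern, the sketch below trains a model centrally on pooled (here, synthetic) data, then ships it to the factory for local, low-latency scoring. It is only a sketch, assuming scikit-learn and joblib; the file name and feature layout are hypothetical.

```python
# Hybrid cloud/on-premises sketch: train centrally, score locally.
# Assumes scikit-learn and joblib; data, features, and paths are hypothetical.
import joblib
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# --- Cloud side: fit a predictive model on pooled historical test data ---
rng = np.random.default_rng(0)
X_train = rng.normal(size=(10_000, 8))                       # parametric test measurements
y_train = (X_train[:, 0] + X_train[:, 3] > 1.0).astype(int)  # synthetic pass/fail label

model = GradientBoostingClassifier().fit(X_train, y_train)
joblib.dump(model, "yield_model.joblib")  # artifact shipped to the factory

# --- Factory side: load the shipped model, score a wafer with no network hop ---
local_model = joblib.load("yield_model.joblib")
wafer_measurements = rng.normal(size=(1, 8))
fail_probability = local_model.predict_proba(wafer_measurements)[0, 1]
print(f"Predicted fail probability: {fail_probability:.2f}")
```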

“Semiconductor test data-management implementations need to use both on-premise and cloud environments, and to be able to actively manage data across them,” said Keith Schaub, vice president of technology and strategy at Advantest America. “Data and analytics leaders are dealing with vastly more complex data management landscapes that need to deal with data movement and access, as well as management and deployment environments. Much of the existing infrastructure is pre-DEX (Data Exchange) era, and as such needs to be upgraded and integrated into a coherent data management ecosystem.”

How much of that processing moves to the cloud, and how quickly, varies by company.

“We still see a lot of reluctance among semiconductor companies to go to cloud solutions. It’s really smaller companies that opt to do so,” observed Dieter Rathei, CEO of DR Yield. “This is a recent development. The first time that a customer explicitly preferred a cloud solution was in 2017. Their small IT department did not want the responsibility for the data management. This remains a rare request, though.”

GlobalFoundries, meanwhile, has begun to consider alternatives. The amount of test data has grown from hundreds of megabytes per wafer to tens of gigabytes. “That’s leading us to explore other technologies, instead of just our on-premise capabilities,” said David Partyka, software applications manager at GlobalFoundries. “To cope with the data volume, we’ve got to start looking at how we leverage cloud technology.”

Writing on the wall
While any change to corporate data management policies would be a big step, the underlying driving forces have been building for some time. In fact, for the analysis of semiconductor manufacturing data, from fab through final test, the ability to connect data across different factory data sources has been steadily improving over the past decade.

“Companies that run on the cloud use a wider ecosystem of suppliers at different facilities and territories,” said Nir Sever, senior director of product marketing at proteanTecs. “The globalization of manufacturing services means that many of these processes are carried out by different providers across many facilities and geographies, creating separated and siloed data sets.”

Breaking down these silos is critical to improve reliability in increasingly heterogeneous designs, particularly in safety- and mission-critical markets.

“With the cloud you now have centralized applications that ingest data from different sources,” confirmed Aftkhar Aslam, CEO of yieldWerx. “This helps in the sense that you now basically can collect data locally, but then you can transfer it to a cloud for it to be semi-permanently stored while the analysis tools execute.”

He’s not alone in seeing that. “While increasingly higher volumes of data are being generated, a scalable data volume storage architecture is required, and soon. The need for native cloud compute resources is becoming increasingly important,” said Paul Simon, group director of silicon lifecycle analytics at Synopsys. “Enabling a collaborative environment is necessary, where not just data can reside and be analyzed, but one that can be easily shared and viewed across all levels of engineering and management for quick detection and root cause identification of issues.”

Connecting data from semiconductor devices now extends to device performance and behavior at the system level. This expands the scope and timescale of the data to be analyzed, which is why companies such as proteanTecs, Siemens EDA, and Synopsys now offer silicon lifecycle management products and platforms. And most analytics platforms include feedback from customer returns, which needs to be looped into that data and analyzed to prevent costly failures in the future.

Who needs what for analytics
The need to move from on-premise to cloud computing depends on a number of factors, such as where engineers sit in the data supply chain, when they need to act on data, and what they actually need from data analytics.

At a fundamental level, the complexity of the semiconductor manufacturing process drives the need to learn from the data generated during a device’s creation. That information can be applied to improving product yield and quality, identifying process excursions, and managing poorly performing equipment. But add in data from an end customer’s system, and engineers are faced with not just more data, but more complexity in determining a predictive model that is helpful for their engineering goals.

Fig. 1: Semiconductor data supply chain data sources. Source: Anne Meixner/Semiconductor Engineering

Depending upon an engineer’s role in the semiconductor supply chain, and the analysis they perform, they may not need cloud-based services.

“If I’m a yield engineer, or a process engineer working solely with data within my factory to make decisions, then I’m perfectly happy with an on-premises solution,” said Mike McIntyre, director of software product management at Onto Innovation. “But if I’m the product engineer for a multi-chip module, I want to understand yield and quality issues. I’ve got die coming from four different factories, being assembled in two different places, ultimately into one system. The only place I could potentially get to it effectively is in some form of aggregated data store. The cloud presents one technical choice for this data store.”
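A minimal sketch of what that aggregated data store enables, using pandas on synthetic per-site results (all site names and columns are hypothetical): pool each source’s test records with a provenance tag, then trace yield back by origin.

```python
# Aggregated data store sketch: pool per-site die test results with a
# source tag so multi-chip-module yield can be traced back to each die's
# origin. Uses pandas; site names and columns are hypothetical.
import pandas as pd

# Per-site test results, normally exported by each fab/OSAT (synthetic here).
fab_a = pd.DataFrame({"die_id": [1, 2, 3], "pass_fail": [1, 1, 0]})
fab_b = pd.DataFrame({"die_id": [4, 5, 6], "pass_fail": [1, 0, 1]})

# Tag provenance before pooling, so yield can be sliced per source later.
fab_a["source_site"] = "fab_A"
fab_b["source_site"] = "fab_B"

aggregated = pd.concat([fab_a, fab_b], ignore_index=True)

# Example query: yield per source site across the aggregated store.
yield_by_site = aggregated.groupby("source_site")["pass_fail"].mean()
print(yield_by_site)
```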

It’s also a way for companies with tight budgets to ramp up capabilities normally limited to the biggest companies with very deep pockets.

“The best argument for cloud solutions is for small and fabless companies,” said DR Yield’s Rathei. “The reason is both economic and technical. As a fabless company, your data is generated elsewhere — at the foundry, at an external test house — and needs to travel through the net. So then it makes sense to install everything in the cloud. Also, because smaller companies tend to have fewer resources in terms of IT infrastructure and IT security, all of these things can be outsourced.”

Managing large amounts of data isn’t something every company is good at, either. “Not all customers have the IT expertise to manage the immense data volume being connected across the supply chain,” said David Park, vice president of marketing at PDF Solutions. “With a cloud-based solution, they can outsource all aspects of the data management and computation. They can just focus on using the analytics.”

Connecting data across the supply chain provides a strong argument to use cloud-based technology for analytics. The immensity of this connected data, and the perceived need to apply ever more complex analytic tools, strengthens this case.

“With the growing complexity of semiconductor manufacturing processes, simple statistical or heuristics-based algorithms alone are not enough. The amount of data collected during the process is increasing and the data types are not as simple as just voltage/power/frequency. A more flexible data model is needed. To answer that, we see a clear shift to deep data analytics powered by machine learning,” said proteanTecs’ Sever. “To address the siloed data, the industry is moving to cloud computing, which allows running complex and ever-evolving algorithms, and enables a scalable data model that can grow and allow versatility of data types. These include on-chip measurements, which serve as the basis for a common data language called Universal Chip Telemetry (UCT).”
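To make Sever’s contrast concrete, here is a hedged sketch (Python with scikit-learn, entirely synthetic data) of a case where univariate statistical limits pass a marginal die that a model trained on the joint distribution of measurements can flag:

```python
# Sketch contrasting fixed statistical limits with a multivariate ML screen.
# Uses scikit-learn's IsolationForest; all data here is synthetic.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
# Two correlated parametric measurements per die (e.g., voltage, frequency).
good = rng.multivariate_normal([1.0, 2.0], [[0.01, 0.008], [0.008, 0.01]], 500)
# A marginal die: each parameter is within its own limits, but the
# combination violates the normal correlation between them.
marginal = np.array([[1.15, 1.85]])

# Univariate +/-3-sigma limits miss the marginal die...
mu, sigma = good.mean(axis=0), good.std(axis=0)
univariate_flag = bool(np.any(np.abs(marginal - mu) > 3 * sigma))

# ...while a model trained on the joint distribution can flag it.
forest = IsolationForest(contamination=0.01, random_state=0).fit(good)
ml_flag = bool(forest.predict(marginal)[0] == -1)

print(f"Univariate limits flag marginal die: {univariate_flag}")
print(f"IsolationForest flags marginal die:  {ml_flag}")
```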

New capabilities
Combining data across silos and throughout the semiconductor lifecycle is incredibly complex, which is why there is so much focus on using AI/machine learning algorithms to help make sense of all of this data and identify relationships and patterns. But it also is fostering new relationships between companies that historically existed in separate silos.

The recent investment by Advantest in PDF Solutions is a case in point. “The PDF Exensio platform enables better collaboration across the supply chain,” said Jeff David, vice president of AI solutions at PDF Solutions. “As a major ATE vendor, Advantest brings their cloud-powered technology, customer relationships and massive global footprint. We bring all that data together from all these different sources. Then, working with Advantest, we can deploy machine learning models and various use cases that require different time domain constraints. For test facilities these time constraints can be real time, near real time, or post process.”

Synopsys’ acquisition of Moortec, Siemens’ acquisition of UltraSoC, National Instruments’ purchase of OptimalPlus, and Ansys’ acquisition of Gear Design Solutions are all the result of similar trends to utilize data more effectively.

Slow adoption
Yet with all the potential benefits that cloud computing offers, reluctance remains. Analytics companies point to their customers’ concerns about data security and data transmission as the primary reasons.

“It is mostly related to data security and privacy, and this makes perfect sense,” said Sever. “The manufacturing data, as well as the quality and reliability information gathered during these processes, has a direct impact on the company’s business aspects, such as direct costs, warranties, and liabilities, and is one of their key sensitivities. Platform vendors need to understand and respect this, but at the same time build in the data security and access control to alleviate these risks.”

Others agree. “The data is highly proprietary and encrypted, so how do you connect it? I like to think of a highway analogy,” said Advantest’s Schaub. “The big data highways are not connected right now and are under construction. We are building those highway connectors. We don’t need to see the actual traffic (the data). We need to know the traffic capacity needed — how much traffic, how fast does it need to flow, and how will it evolve and grow over time.”

Until these highways are well constructed, engineers and engineering managers have valid concerns regarding transmitting large amounts of data across the globe. And it’s not simply a matter of security. The reliability and speed of the broadband connection also matters.

Not all factory locations are equally well connected for transmitting data from one end of the globe to the other. Consider the numerous test/assembly facilities in Asia and the South Pacific. A 2019 analysis of inter-regional bandwidth by TeleGeography reported that connections from the South Pacific region (a.k.a. Oceania) were significantly slower than those from Asia.

Even with very high-bandwidth connections, there remain valid concerns about data reliability due to the first- and last-mile links in the transmission path.

“When transferring data between sites, consider how many network switches comprise the network connection,” said Onto’s McIntyre. “In general, it’s probably fine through most of that transfer. But either the last mile or the first mile is going to be constrained. So, when it comes to mission-critical data, which day in and day out almost instantaneously runs your factory, I really don’t see this data moving off of the premises.”

Others share similar observations about the reliability of data transmission, and about seemingly small delays that degrade an engineer’s experience with analytics platform visualizations. Even a 0.5-second delay in loading a wafer map from an Asia-based test facility matters to a product or quality engineer sitting in Europe.

Slow data connections to a cloud-based solution can even motivate semiconductor companies to shift from the cloud back to an on-premises solution. Rathei provided an example: “The customer had a cloud-based solution from an alternative vendor. Because of this customer’s growth, bandwidth issues resulted in this solution becoming slower and slower, to the point that, in effect, it didn’t work anymore. We moved them to our platform with an on-premises solution, and now they have fast data access again.”

Other practical considerations
This certainly isn’t always the case. Those with high-speed connections can benefit from the nearly unlimited processing capabilities of hyperscale data centers. But latency is a critical factor for certain actions, and so is managing IT infrastructure costs. Which approach is better depends on the specific application, the company’s infrastructure, the location, and the specific tasks being addressed.

“In semiconductors, every millisecond, and even every microsecond, is accounted for,” said Advantest’s Schaub. “During in-situ production testing, we need to manipulate a lot of data in real time, which means we also need extremely low-latency solutions that work seamlessly across the test supply chain.”

But whether that is on-premise or cloud can vary greatly. “With real-time critical systems like wafer lot disposition, latencies represent a killer factor,” said GlobalFoundries’ Partyka. “You can’t wait for the data on a remote compute system to load balance, turn on, and then disposition the lot. You need to be on-premises and reacting right away.”
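What such a latency-critical, on-premises check might look like in miniature, with hypothetical control limits and no remote call anywhere in the decision path:

```python
# Sketch of a latency-critical lot-disposition check that runs entirely
# on-premises: a simple rule evaluated against in-line measurements.
# All limits and field names are hypothetical.
from dataclasses import dataclass

@dataclass
class LotSummary:
    lot_id: str
    mean_cd_nm: float       # mean critical dimension, nanometers
    defect_density: float   # defects per cm^2

# Control limits, pushed down periodically from the central system.
CD_LIMITS_NM = (44.0, 46.0)
MAX_DEFECT_DENSITY = 0.05

def disposition(lot: LotSummary) -> str:
    """Decide hold/release locally, in microseconds, not network round trips."""
    low, high = CD_LIMITS_NM
    if not (low <= lot.mean_cd_nm <= high):
        return "HOLD: critical dimension out of limits"
    if lot.defect_density > MAX_DEFECT_DENSITY:
        return "HOLD: defect density excursion"
    return "RELEASE"

print(disposition(LotSummary("L1234", mean_cd_nm=45.1, defect_density=0.02)))
```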

On the other hand, a cloud computing solution may make sense for building machine learning predictive models, which engineers want to retrain with the latest data. And IT data management engineers find the flexibility of the cloud’s on-demand model extremely attractive for supporting surges in computing needs.

This comes with a caveat, though. Industry analytic users and platform providers note that cloud technology costs can balloon if not closely managed.

“The cloud fee structure has pros and cons,” said Sunil Narayanan, senior director of applied intelligence solutions at GlobalFoundries. “You have the ability to scale up and scale down on very short notice. It’s almost like a 10-minute thing. It auto-scales with the computational demands. The con is if you don’t have a good governance model, you may get surprised by the bill at the end of the month.”

Stacy Ajouri, senior member technical staff at Texas Instruments, noted a similar concern. “We have to be careful with our usage because the cloud can become exorbitantly expensive. We can use quite a lot of data for an analysis. If you don’t calculate properly with your use model of data and analysis, the costs add up quickly.”
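A toy version of the governance guardrail Narayanan and Ajouri are describing, with all rates and budgets hypothetical: project month-end spend from month-to-date usage and raise an alert before the bill arrives.

```python
# Simple cloud-spend guardrail sketch: linearly project month-end cost
# from month-to-date usage. Rates and budget are hypothetical.
from datetime import date
import calendar

HOURLY_RATE_USD = 4.50          # hypothetical per-node compute rate
MONTHLY_BUDGET_USD = 50_000.0   # hypothetical governance limit

def projected_month_cost(node_hours_to_date: float, today: date) -> float:
    """Project month-end spend from month-to-date node-hours."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    spend_to_date = node_hours_to_date * HOURLY_RATE_USD
    return spend_to_date * days_in_month / today.day

projection = projected_month_cost(node_hours_to_date=6_000, today=date(2021, 6, 15))
if projection > MONTHLY_BUDGET_USD:
    print(f"ALERT: projected spend ${projection:,.0f} exceeds budget")
else:
    print(f"OK: projected spend ${projection:,.0f} within budget")
```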

Conclusion
Cloud computing solutions make the most sense when connecting data across multiple sources along a supply chain. The cloud is not always the most cost-effective choice, particularly once data transmission and storage costs are included in the picture. In addition, not all data-driven decisions in a factory require a cloud-supported analytics solution.

But interest in the cloud is growing. How much of future data compute and storage will be on-premises versus cloud-supported is unknown, and it may be difficult to arrive at a number because solutions are likely to be a mix of both. Nevertheless, this still represents a big shift for the chip industry as it comes to grips with the need to break down data silos and better utilize data across the semiconductor supply chain.

Susan Rambo contributed to this story.

Related Stories

Infrastructure Impacts Data Analytics

Too Much Fab And Test Data, Low Utilization

Data Issues Mount In Chip Manufacturing



Comments

joe says:

Isn’t Edge computing better suited for chip manufacturing than Cloud?

Anne Meixner says:

Joe,

Cloud or on-premises in this article refers to where you store and compute all this data to build models and make decisions. Your point is well taken that decisions within a chip manufacturing facility should be made locally, and that the computation for those decisions should happen at the edge.

One can use cloud computing facilities to support the large-scale computation, with your algorithm of choice, needed to build a predictive model that determines whether a chip is good or defective.
See this article for an overview of adaptive test methodology:
https://semiengineering.com/adaptive-test-gains-ground/
