Big Challenges In Verifying Cyber-Physical Systems

Experts at the Table: Models and standards are rare and insufficient, making it difficult to account for hardware-software and system-level interactions and physical effects.


Semiconductor Engineering sat down to discuss cyber-physical systems and how to verify them with Jean-Marie Brunet, senior director for the Emulation Division at Siemens EDA; Frank Schirrmeister, senior group director for solution marketing at Cadence; Maurizio Griva, R&D Manager at Reply; and Laurent Maillet-Contoz, system and architect specialist at STMicroelectronics. This discussion was held at the recent Design, Automation and Test in Europe (DATE) conference.

SE: What are cyber-physical systems?

Schirrmeister: The accepted definition is, ‘It’s a computer system in which a mechanism is controlled or monitored by computer-based algorithms. So physical and software components are deeply intertwined, and it’s able to operate on different spatial and temporal scales, exhibit multiple distinct behavioral modalities and interact with each other in ways that change with context. Examples include smart grid, automotive, autonomous automotive systems, medical/industrial robotics, and automated pilot.’ So it’s really way beyond electronics, in an area we refer to as computational software. Hardware/software was a topic in the last decade, where we all worried about how hardware and software interact. Now you add in these physical elements, electromagnetic effects, fluidity effects, thermal effects. So it’s going much wider.

Griva: Sensors are revolutionizing the electronic design that we’ve had for the past 20 years. Having a part of the computer that is dedicated to capturing information from the physical world in real-time, and acting in real-time, completely changes the rules of the game. You are designing an electronic system that isn’t just a sensor or a computer or a microcontroller. It’s all these things together, plus inference and power control needed for transmitting data out to other systems. CPS has a broad definition by itself, but it’s pushing a revolution in terms of electronic design.

Brunet: We refer to CPS as a digital twin. This is particularly important in certain verticals, such as automotive and industrial, and we see a number of different domains. One is the sensor world, where you need to sense. You need to be able to capture data, either by vision or any type of sensor, and to compute locally what you sense. This is raw fusion of data that has been computed with a certain type of algorithm, and as a consequence there is activity that needs to take place. It could be wireless or wired. What is interesting with digital twins is that they help with maintenance and support. You want to design digitally, with a very advanced model, what is happening physically. But it also helps to track through your remote connection when something is happening on that system, so we are able to preemptively create a maintenance report before a fundamental problem can occur. This notion of sensing, computing and actuating, based on very specific advanced models with different levels of fidelity, is something that we see just starting as an industry.

Maillet-Contoz: I agree that CPS is basically composed of sensors, actuators, some local computation capabilities with microcontrollers, and communication. A full CPS system is not only a node. It is a collection of nodes that work together to develop and provide a system functionality. They have their multiple functional properties, they are low energy, they might have cellular communication, and the full system must be resilient. So we need to ensure the full functionality of the system, and this has a deep impact on the way systems are designed and maintained.

SE: Is this a merging of what we used to think of as the IoT and the edge, where now we’re moving from rather dumb devices that could communicate to more processing inside these devices — and potentially adding mobility?

Schirrmeister: That is one part of it. I would not call it IoT. The Internet of Things has always been problematic from a naming perspective. Some people think of it as its own vertical. But it does have the characteristic of a large number of sensors being connected. Creating this data really goes through all the verticals, from consumer devices to hyperscale compute engines to mobile networking, auto, industrial and health. But it’s not only one component. It’s a whole network of those connected items. There are different types of representations throughout their lifecycle. There are different types of digital twins for design and development, for manufacturing, and for predictive maintenance, which is throughout the product lifecycle, and even for recycling and scrapping. That’s one dimension, which is where in the product lifecycle you apply that cyber-physical system and how you connect all of those. You have to decide how much processing to do at the device, at the sensor, or at the far edge, and what you move into the data center. There’s a lot of wiggle room from an architecture perspective.

SE: We’re now dealing with electronic systems that are in motion. We’re also starting to add in things like AI and machine learning and deep learning into these devices. This is a very complicated design problem, because you no longer can take a divide-and-conquer approach. We now have to think about systems, how they behave over time, and how they interact with other systems. How does this affect the design process?

Maillet-Contoz: We need to consider the layers in a system. In a typical IoT system, we will have individual nodes, and those nodes will communicate with each other to create a full system. So when dealing with digital twins, we might consider various levels. You might think of a very abstract model, where each node is just a functional block that provides a certain capability to the whole system. But with such an abstract model, you will never be able to validate the hardware/software integration. So you can also think of another kind of model, where you would have a model of the architecture of each node, to be able to implement and validate the hardware and the software parts of the system. And you might compose all these various models at different levels of abstraction, but this also might have an impact on global system performance for simulation. You need to make the right modeling choices, and that depends on the use case you target for your models. I suspect that we also need to revisit the way we think about the models, targeting various levels of accuracy that will depend on what you want to focus on, and what you want to validate with your models.

Brunet: I agree with that. We see this as a complex modeling problem. You need the right model with the right fidelity, the right accessibility and usability. Verification of the system is difficult when it comes to the interaction between software and hardware. It seems like software now defines the behavior and the success of a semiconductor. It’s changing the way chips are being verified. They really are based on software performance and analysis, and that interaction between the software and the hardware is key. So how do you verify this? We see the verification of CPS or digital twins as a big challenge and a key opportunity for us. The amount of local computation on some devices today is astonishing. Verifying a chip design within the context of a digital twin for a car or medical devices starts with the software, and how the hardware is going to interact with the software. So we see a huge need for system verification and validation, in addition to the semiconductor within the context of the software.

Schirrmeister: And it’s adding new problems from an industrial perspective. The software/hardware aspect is one piece of it. But then you also have all the physical data. With effects like thermal in that same context, it becomes a big issue. I don’t think the hardware/software problem is solved, by any means. But now we have to take into account all of the physical aspects, too. This is why, from a market perspective, we now are talking to people who have mechanical and systems backgrounds.

Griva: That’s absolutely correct. There’s much more interaction in the early phases between software/hardware design, as well as mechanics, than a few years ago. In particular, we see a strong need for knowledge about batteries and wireless, because one affects the other. The duration of the battery for a self-sustaining IoT device is critical. Also, the software that runs, and which is able to switch the device on and off and put it in a deep, deep sleep mode, impacts performance and the drain from batteries. So how can you put all these things together from the beginning? It’s really, really tough. And so you end up having a hardware design much earlier. In the past, you would have a round of three or four prototypes, and the fourth prototype, which we call P4, was the last one before going into mass production. Now maybe you have P10, because things cannot be tested in the early phases. And I’m really scared about AI algorithms going into that and executing programs in real-time from one node to another, because this complexity of systems can only be tested in the very late phases of the overall design phase. Unfortunately, that also can impact the choice of the components and even the microprocessor in the early phases. So we need to find a way of modeling these kinds of things much better. Otherwise, the test and validation efforts in the different phases will explode, and the cost of the solution will explode, and it will have a negative impact on the stability and availability of those devices.
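The link Griva describes between sleep behavior and battery drain can be put into rough numbers early in a design. The following is a minimal Python sketch, not anything from the panel: it assumes a node that alternates between one active state and one deep-sleep state, and it ignores self-discharge, temperature, and radio retries. All names and figures are hypothetical.

```python
def average_current_ma(active_ma: float, sleep_ma: float,
                       active_s: float, period_s: float) -> float:
    """Time-weighted average current over one wake/sleep cycle.

    Assumes exactly two states: active for `active_s` seconds,
    deep sleep for the rest of the `period_s` cycle.
    """
    if not 0 < active_s <= period_s:
        raise ValueError("active_s must be in (0, period_s]")
    duty = active_s / period_s
    return duty * active_ma + (1.0 - duty) * sleep_ma


def battery_life_days(capacity_mah: float, avg_ma: float) -> float:
    """Idealized lifetime: capacity divided by average drain, in days."""
    if avg_ma <= 0:
        raise ValueError("avg_ma must be positive")
    return capacity_mah / avg_ma / 24.0


if __name__ == "__main__":
    # Hypothetical node: 20 mA while awake for 2 s, 0.01 mA asleep.
    for period_s in (60, 600):
        avg = average_current_ma(20.0, 0.01, 2.0, period_s)
        days = battery_life_days(1000.0, avg)
        print(f"wake every {period_s:>3} s: {avg:.4f} mA avg, {days:.0f} days")
```

Even this crude estimate shows why a firmware decision about wake-up frequency can matter as much as the choice of battery, which is the kind of trade-off a better early model would expose before the later prototypes.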

Schirrmeister: A key question is at which level of abstraction you are modeling these items. Philippe Magarshack [vice president of the microcontrollers and digital ICs group at STMicroelectronics] talked about management in the fabs, where they had 3 terabytes of data being generated and thousands of robots doing tens of thousands of paths per day. You cannot model that by going down to the current of the motor in the robot with all the low-level software. So you will have hybrid abstractions, with some elements being modeled at a higher level and others in full detail. That’s a whole gamut of new options for how you put together your execution environment.
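One way to picture the hybrid abstractions mentioned here is a simulation in which most robots are modeled coarsely and only the one under study is modeled in detail, all behind the same stepping interface. The Python sketch below is illustrative only. The class names, the first-order motor model, and every parameter are assumptions made for the example, not a description of any vendor's tool.

```python
from typing import Protocol


class RobotModel(Protocol):
    """Common interface shared by models at any level of abstraction."""
    paths_done: int

    def step(self, dt: float) -> None: ...


class AbstractRobot:
    """High-level model: one path is completed every `path_time` seconds."""

    def __init__(self, path_time: float) -> None:
        self.path_time = path_time
        self.elapsed = 0.0
        self.paths_done = 0

    def step(self, dt: float) -> None:
        self.elapsed += dt
        while self.elapsed >= self.path_time:
            self.elapsed -= self.path_time
            self.paths_done += 1


class DetailedRobot:
    """Lower-level model: speed follows a first-order lag toward max_speed,
    integrated with a much smaller internal time step (explicit Euler)."""

    def __init__(self, path_length: float, max_speed: float,
                 tau: float, inner_dt: float = 0.001) -> None:
        self.path_length = path_length
        self.max_speed = max_speed
        self.tau = tau
        self.inner_dt = inner_dt
        self.speed = 0.0
        self.position = 0.0
        self.paths_done = 0

    def step(self, dt: float) -> None:
        n = max(1, round(dt / self.inner_dt))
        h = dt / n
        for _ in range(n):
            self.speed += h * (self.max_speed - self.speed) / self.tau
            self.position += h * self.speed
            if self.position >= self.path_length:
                self.position -= self.path_length
                self.paths_done += 1
                self.speed = 0.0  # robot stops at the end of each path


def run(models: list[RobotModel], dt: float, steps: int) -> int:
    """Advance every model in lockstep and return total completed paths."""
    for _ in range(steps):
        for model in models:
            model.step(dt)
    return sum(model.paths_done for model in models)


if __name__ == "__main__":
    fleet: list[RobotModel] = [AbstractRobot(2.0) for _ in range(3)]
    fleet.append(DetailedRobot(path_length=1.0, max_speed=1.0, tau=0.2))
    print("paths completed:", run(fleet, dt=0.5, steps=20))
```

The detailed model does hundreds of internal steps for every step of the coarse ones, which is the simulation cost Schirrmeister is pointing at: full detail everywhere would not scale to thousands of robots.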

Brunet: The models are not available all the time. We come from a semiconductor background where views and models have to be available. Otherwise, the rest of the flow cannot run. With industries like automotive or medical, sometimes the model doesn’t even exist. So there is a big need in the market right now for advanced modeling, and ensuring the accuracy and stability of the model. Sometimes we talk to customers in certain verticals and they don’t even understand RTL, what the model is, or how to create one. Access to models, and availability, are problematic.

Maillet-Contoz: Another challenge is that the value chain and the communities that shed light on ways to model CPS do not use the same tools. Some industries never have used the kind of tools we do in the semiconductor industry. We also need to consider that the same system might be modeled using various tools and various paradigms, and we need to ensure there is global consistency between all these layers of the models used in the same device or same set of devices. There is also a need to hook these models together so you can simulate in a kind of hybrid mode with various levels of accuracy for various simulation patterns. And depending on the target usage, what are the requirements for the people who are using these models? And how can we serve the community to ensure that the models will serve the purpose with the best simulation performance we can reach, but also provide the right level of accuracy so that people can get the value from the models?

Brunet: That’s very important, because when you move to the OEM level in the car market, for example, there is not a single provider. They have models that are executed by different types of software that are provided by many different providers. And if the solution is to try to look at the system and say everything has to be modeled with my own tool, like closed flows that we have seen in the semiconductor space, it doesn’t work in this space. You really need to be able to integrate all sorts of models, all sorts of different tools that are sometimes competitive in the EDA space. CPS is pushing a lot of providers like us to integrate with many different types of models. It doesn’t matter if it runs better with our tools. We need to be able to integrate into a total system view. So that level of ecosystem integration is far more important than we have seen in the past.

SE: One of the challenges is that in the past, we would design chips and they would go into smartphones or computers, and we’d have billions of units of the same design. What we’re seeing now is chips developed for CPS can vary for individual markets, but they’re also very different from one vendor to another. On top of that, we need to understand how these devices age and how they interact over time. How do we move the ball forward as an industry?

Maillet-Contoz: Part of the answer is to define the right interfaces. So we certainly need to make huge progress on standards to be able to better integrate various simulation models together. If a carmaker wants to integrate models for all the components, then they will need to integrate all this in a seamless way. Otherwise, I don’t see any possibility to get a full system integrated.

Schirrmeister: Standards is a very big item. During a panel discussion a couple of years back, an automotive OEM turned to a Tier 1 supplier and said, ‘I have different semiconductor suppliers and their models don’t work with each other.’ And then all of them looked at the EDA provider and said, ‘Oh, your tool doesn’t work because it doesn’t bring all those models together properly.’ So that’s a standardization challenge. But the problem goes beyond standardization. It’s also about new levels of knowledge sharing. I grew up as a software engineer doing hardware design, but now I have to deal with mechanical issues, too. Maybe a mechanical effect needs to be modeled in the context of a motor, which involves new levels of interfaces at different levels of abstraction. But some of these interfaces are not there yet, or at least they’re not that commonly known. So standardization across all offerings becomes a big issue. And then, some level of cross-functional education is necessary because people need to at least understand and appreciate the other challenges and needs. So it’s not only that hardware developers are from Mars and software developers are from Venus, or whatever planet you want to assign them to. Now you have new classes of engineers from other planets, who have different ways of talking about things. And they all need to be brought together in this context of a cyber-physical system from a modeling perspective and a verification perspective.

Brunet: There are some types of standardization that will do well. FMI (Functional Mock-up Interface), for example, is an appropriate standard. Transaction-Level Modeling (TLM) is useful where the interface is relatively clean. And it doesn’t matter if the model is run by a tool from a competitor or not. So I agree some standardization effort is necessary, but I also think this is happening. We’re not reaching a point where the market is saying, ‘There is no way to communicate your standard between different tools and different models.’
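The value of a clean, standard interface can be illustrated with a toy co-simulation. In the Python sketch below, two independently written models expose only `set_input`, `get_output` and `do_step`, and a small master loop exchanges values between them at fixed communication points. The method names are loosely inspired by the stepping style of FMI co-simulation, but this is not the FMI API. The interface, the thermostat example and all constants are hypothetical.

```python
class ThermalPlant:
    """Physical side: a room temperature with heating and losses to ambient."""

    def __init__(self, temp: float = 20.0, ambient: float = 20.0) -> None:
        self.temp = temp
        self.ambient = ambient
        self.heater_on = False

    def set_input(self, name: str, value: object) -> None:
        if name != "heater_on":
            raise KeyError(name)
        self.heater_on = bool(value)

    def get_output(self, name: str) -> float:
        if name != "temp":
            raise KeyError(name)
        return self.temp

    def do_step(self, t: float, dt: float) -> None:
        heat = 2.0 if self.heater_on else 0.0  # assumed heating rate, degC/s
        self.temp += dt * (heat - 0.1 * (self.temp - self.ambient))


class Thermostat:
    """Software side: on/off control with hysteresis between low and high."""

    def __init__(self, low: float = 21.0, high: float = 23.0) -> None:
        self.low = low
        self.high = high
        self.temp = 0.0
        self.on = False

    def set_input(self, name: str, value: float) -> None:
        if name != "temp":
            raise KeyError(name)
        self.temp = float(value)

    def get_output(self, name: str) -> bool:
        if name != "heater_on":
            raise KeyError(name)
        return self.on

    def do_step(self, t: float, dt: float) -> None:
        if self.temp < self.low:
            self.on = True
        elif self.temp > self.high:
            self.on = False


def co_simulate(plant, controller, dt: float, steps: int) -> list[float]:
    """Master loop: exchange values, then advance each unit by one step."""
    t = 0.0
    trace = []
    for _ in range(steps):
        controller.set_input("temp", plant.get_output("temp"))
        controller.do_step(t, dt)
        plant.set_input("heater_on", controller.get_output("heater_on"))
        plant.do_step(t, dt)
        t += dt
        trace.append(plant.get_output("temp"))
    return trace


if __name__ == "__main__":
    temps = co_simulate(ThermalPlant(), Thermostat(), dt=0.1, steps=2000)
    print(f"min {min(temps):.2f} degC, max {max(temps):.2f} degC")
```

The master never looks inside either model, so each could come from a different supplier or run in a different tool, which is the property Brunet attributes to standards with a relatively clean interface.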

Griva: The topic of interfaces is definitely at the core of everything. Hopefully we will not have so many different devices, and we will try to re-use as much as possible from existing models and upgrade them as soon as we define one of them. But for interfacing between engineering sections of manufacturers and contract manufacturers, standards are extremely important. If we have a variety of products that we have to produce and sell, then the contract manufacturer that is manufacturing the end product will have a multitude of customers. Standards are important at that level. Having a proper interface is essential, because it is absolutely normal for the manufacturer to push for a change in the design to bring down the cost and improve the speed of manufacturing. Therefore, we have to break down our design as much as possible. It has to be flexible for substitutions, because we are making new products or because our customers are asking for a specific sub-module or sub-function to remain competitive. This is something we have to take into consideration when we do a model at the early stages.

