Experts at the Table: With both established companies and new players clamoring to play a role in the automotive space, how is the industry moving towards automation? (Part One)
Semiconductor Engineering sat down to discuss functional safety thinking, techniques and approaches to automation with Mike Stellfox, Fellow at Cadence; Bryan Ramirez, strategic marketing manager at Mentor, a Siemens Business; Jörg Grosse, product manager for functional safety at OneSpin Solutions; and Marc Serughetti, senior director of product marketing for automotive verification solutions at Synopsys. What follows are excerpts of that conversation.
L-R: Stellfox, Ramirez, Grosse, Serughetti.
SE: As the industry evolves from manual approaches to functional safety to more automated and theoretically less risky approaches, what challenges does that bring?
Grosse: We want to automate the FMEDA [failure modes, effects, and diagnostic analysis] process as much as possible. My personal opinion is that you can't have a single FMEDA tool. I don't think that exists. You have to have several building blocks: a part for the hardware safety analysis, a part for the diagnostic coverage determination, and a part for the computational model. That's where we are looking to automate this process.
Stellfox: I have a team at Cadence that is putting together a functional safety solution primarily focused on FMEDA verification. We are trying to automate as much of the process as possible. Of course, functional safety has been around for a long time, but in the semiconductor industry, especially in large-scale, complex applications like ADAS, it's very new, and there's not a lot of maturity in the industry in terms of how to do really robust, efficient safety verification. There's a big opportunity, starting with FMEDA, to automate a lot of the pieces, but one of the biggest things I still see is the need to develop more experts in our semiconductor industry. That's probably the biggest challenge.
Serughetti: Functional safety is something that has become extremely important to a lot of our customers. We need to go back to the drivers, to why people are looking at this. A lot of functional safety has been done in the past on much simpler chips, and the complexity of today's chips is driving this. I think there are two parts to functional safety. The first one is expertise. I agree there is a huge lack of expertise. Today we know of customers hiring people with just functional safety expertise and no semiconductor background at all, so there's a big lack of expertise. The second part is the automation, which is very important. It's not a one-tool thing. It's a series of things that can be automated, and there's a huge opportunity for automation on a lot of those activities, especially in the verification area. Also, you have to think about the effects throughout the supply chain, because functional safety is not a one-company issue. It touches IP, SoCs, the customers of those SoCs, and so on. There's also a changing ecosystem in that space.
Ramirez: On top of what the others said, an interesting question going forward is this: automation is great, and we need it, especially to deal with the increased capacity and the lack of expertise. But how do you merge that with the need to still have expert judgment? Are you ever going to be able to get away from needing those experts? Trying to figure out how those two worlds merge most effectively to achieve the end goal is pretty interesting going forward.
SE: What pieces of FMEDA can be automated, and what can’t?
Grosse: You can't automate the analysis process. You have to have engineers in there to do the hardware safety analysis part of it. Once that's done efficiently, then you can use the model of the IC to automate lots of it; the computational part, especially, can be highly automated. We believe we should get away from spreadsheets, which do not enable a repeatable process. This is where we as EDA can come in and help a lot, by having the IC model at hand and extracting all the information we need from it, and, on top of that, doing the verification part.
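To make that computational part concrete, here is a minimal sketch, with invented failure modes, FIT rates and coverage numbers, of the kind of single-point fault metric roll-up an FMEDA spreadsheet typically holds. It uses the simplified ISO 26262-style relation where a mode's residual failure rate is λ × (1 − DC), and it ignores safe and latent faults:

```python
from dataclasses import dataclass

@dataclass
class FailureMode:
    name: str
    fit: float  # failure rate in FIT (failures per 1e9 device-hours)
    dc: float   # diagnostic coverage of the safety mechanism, 0.0..1.0

def spfm(modes: list[FailureMode]) -> float:
    """Single-point fault metric: the fraction of the total failure
    rate that does NOT escape the safety mechanisms (simplified)."""
    total = sum(m.fit for m in modes)
    residual = sum(m.fit * (1.0 - m.dc) for m in modes)  # undetected share
    return 1.0 - residual / total

# Invented example entries, the kind normally kept in a spreadsheet.
modes = [
    FailureMode("ALU stuck-at",    fit=120.0, dc=0.99),   # e.g. lockstep
    FailureMode("SRAM bit flip",   fit=300.0, dc=0.999),  # e.g. ECC
    FailureMode("config register", fit=40.0,  dc=0.90),   # e.g. parity
]
print(f"SPFM = {spfm(modes):.4%}")
```

The point Grosse makes about spreadsheets is that the FIT and coverage numbers above should be extracted from, and kept consistent with, the actual IC model rather than typed in by hand.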
Serughetti: There are really three aspects. Expertise, first. Then verification, and the verification can be automated. The third part is process management. It would be so simple today if you didn't need this, but customers are saying, 'My guy on this side of the company doesn't talk to the guy over there, and they interpret the same information differently.' So there's a process management aspect that can be automated, as well.
SE: How does the expert judgment of the functional safety experts manifest today?
Grosse: There's probably a bit of confusion around that, because expert judgment can go a long way. It can start with just a paper analysis: looking at the standard and asking what a given safety mechanism would give me. But there is also more qualified expert judgment, where I'm actually looking at the design itself, making sure that my judgment is correct, and doing some partial verification. The term is a bit unclear. Some people just write down a number and that's it; others do more qualified expert judgment. Qualified expert judgment is where we need to go.
Ramirez: I agree with that. We call that validating the assumptions that the experts make. You need to do that hand analysis because you need to do it before there's any design. But you want to validate those numbers as soon as you can, before you get to an expensive fault campaign. The sooner you can iterate and figure out whether your safety architecture is correct, the better.
Serughetti: What's also changing in expertise is that it's no longer core semiconductor expertise only. You need to understand where your SoC is going to be used. You have to start bringing in expertise about the customer's application, and that's a change in competency for a lot of people. They understood their chip, but they never had to track that aspect as well.
Stellfox: Use cases are very key. Some of the systems companies are actually designing their own chips, and they have the advantage that the use-case knowledge is well understood, even if it sits in separate groups. But if you're a semiconductor company developing a general chip that you need to deliver, the use cases are not something you can look at after the fact. You have to factor them in up front for safety, because the use cases determine whether or not the safety architecture you designed can be used in a safe way.
SE: In addition to use cases, how else do we develop experts in functional safety within EDA?
Grosse: Get the safety experts and the verification experts, lock them in a room together for a week, and demand they start talking to each other, because this is where there is a huge barrier. Safety engineers understand the FMEDA process very well, but they don't understand design. And you need to be very careful about whom you're talking to and which language you're using.
Stellfox: That's exactly right for my team. I have safety experts and I've put them together with some verification experts, and they are cross-pollinating because they both have their strengths. What I'm typically seeing in our customer base is that there are not that many safety engineers, so the verification engineers are having to expand their scope. You need to be able to translate, to talk about things in terminology the verification engineers understand, and verification engineers have the right kind of background to expand into this space. The concepts and abstractions are things they already think about. We talk about it in a simple way: functional verification is about verifying how the design works in a positive way, and safety verification, at an abstract level, is negative testing, i.e., will it still work under fault conditions? That's very much the way verification engineers tend to think, even though they don't have all the safety concept expertise.
Serughetti: You should look at the market. You already have the people who have done functional safety for years, and those people understand functional safety. Their challenge is more about, 'My next design is so much bigger; what do I do?' Then you have a slew of people who suddenly want to come into the market, and it's a completely different story with them. It's, 'Why is it not the same as functional verification?' You need a different type of education in those cases. But I agree about functional verification engineers. There is a way to bring them to functional safety, but a lot of education needs to happen for them to understand what the difference is and what they are trying to do differently, because underneath they use a lot of similar technologies. It's understanding what you are trying to achieve, why it is different, and what guides you through that.
Ramirez: It's analogous to when OVM and UVM came out. Getting verification engineers who wrote Verilog to do object-oriented programming was a real challenge, and there was always a lack of experts in that space. Oftentimes people augmented by hiring C++ experts and trying to teach them verification. We're going through a similar phase here, where you're going to augment with non-ideal people. It's going to take time to merge the thinking together.
SE: So it’s not a problem that can’t be solved as far as the concepts themselves, right? It’s not like a totally different world. The verification engineers can be brought up to speed.
Stellfox: It's still a pretty big learning curve. Take myself: I knew nothing about safety three years ago, and I'm far from an expert now. It's a huge number of things: tools, methodology, analysis. I see a huge opportunity for innovation in the space, especially for EDA to pull those things together and make it more systematic at the semiconductor level. It is systematic, I would say, at the system level, although even there, with some of the new ADAS things coming, there are whole companies trying to do the same thing at the system level, verifying cars rather than chips.
SE: What are some of the biggest things that you’ve picked up over the past three years that are different from what you did before? Is it really a change in mindset the way that you approach these issues? How would you qualify or classify what the thinking is for functional safety compared to verification?
Stellfox: Most people start and say, 'I just need to run a bunch of fault simulation,' or, 'We're going to have to run fault simulation on our entire chip.' That's the first thing everybody jumps on. But the first thing is understanding what you're trying to achieve. It's actually not that complicated. You have safety mechanisms that are intended to catch random faults, and you need to devise a methodology by which you define the ways the device can fail. For verification engineers, that's a very natural process because we do verification planning. This is just expanding verification planning to thinking about how the device could fail, and then making sure that you have safety mechanisms to capture that. Once you put the analogies together, it's very much the same kind of skills that verification engineers have. The hardest thing is capturing the FMEDA. Just as the hardest thing to teach a good verification engineer is how to do good verification planning, that's the key part here: How do you decompose the problem? How do you come up with the high-level things you actually need to plan for in the FMEDA?
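As a rough sketch of what 'verification planning expanded to failure planning' looks like at its most basic, the decomposition below lists failure modes per block and flags any mode without a safety mechanism. All block, mode and mechanism names are invented:

```python
# Hypothetical failure plan: per-block failure modes mapped to the
# safety mechanism intended to catch them. The planning discipline is
# the same as a verification plan; the subject matter is how the
# device can fail rather than how it should work.
failure_plan = {
    "cpu_core": {
        "stuck-at in datapath":    "lockstep compare",
        "control-flow corruption": "watchdog",
    },
    "l1_cache": {
        "single-bit flip":         "ECC",
        "address decode fault":    None,  # gap: no mechanism planned yet
    },
    "interconnect": {
        "dropped transaction":     "end-to-end CRC",
    },
}

for block, modes in failure_plan.items():
    for mode, mechanism in modes.items():
        if mechanism is None:
            print(f"PLAN GAP: {block} / '{mode}' has no safety mechanism")
```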
Serughetti: Safety architecture, the safety architect, understanding where the failure could appear, what the safety mechanisms are. That's the big part up front. That's the expertise.
Grosse: I believe the verification planning analogy is okay, but there is a level of additional expertise required beyond just thinking about the safety mechanisms. You have scenarios with layered safety mechanisms, which makes it really complex. Then you have to throw dependent failure analysis into the mix. I like it; it's the right way of thinking going forward. But there is more to it than turning your verification planning tool into an FMEDA tool.
SE: What comes next?
Grosse: One thing ties into that. We think of the chip very hierarchically, but that chip hierarchy might not map exactly to the safety hierarchy.
SE: What does that mean as far as being able to qualify something as an automotive grade chip?
Grosse: When you divide the chip into parts and subparts according to the standard, and you divide it according to the safety architecture, that might not map exactly to the hierarchy tree of the chip that you have in your tool. You might have a block with two safety mechanisms in it, or safety mechanisms that cover several blocks, or just part of a block. We need slightly different thinking there.
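A concrete, purely illustrative way to see the mismatch: the safety view below attaches each mechanism to arbitrary slices of the design, which need not coincide with whole nodes of the RTL tree. All mechanism names and hierarchy paths are invented:

```python
# Hypothetical safety view: each safety mechanism names the design
# scopes it protects, which need not align with whole hierarchy nodes.
safety_view = {
    "lockstep_compare": ["soc.cpu0", "soc.cpu1"],          # spans two blocks
    "ecc":              ["soc.l2cache.data_array"],        # part of a block
    "bus_parity":       ["soc.noc", "soc.periph.bridge"],  # crosses subtrees
}

# An FMEDA tool has to reconcile this with the single RTL tree, e.g.
# when rolling up failure rates for one hierarchy node.
def mechanisms_covering(scope: str) -> list[str]:
    return [m for m, scopes in safety_view.items()
            if any(s.startswith(scope) or scope.startswith(s) for s in scopes)]

print(mechanisms_covering("soc.l2cache"))  # ['ecc']
```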
SE: What’s the thinking?
Grosse: The thinking is looking at it from the safety architecture point of view.
SE: How do we define the safety architecture?
Grosse: That's driven by the requirements, by the safety goals, and you have to decide what goes on chip. It comes from the use cases, basically. It starts with, 'Do I have an ASIL-B or an ASIL-D device?' Then I know what safety mechanisms to put into the system. But the analysis and the verification should be very centric to the safety mechanisms I defined. That also allows us to be more effective about what we have to simulate and what we don't.
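For reference, the ASIL level sets quantitative hardware-metric targets that the analysis then has to demonstrate. A minimal lookup, with the thresholds as commonly quoted from ISO 26262-5 (verify them against your edition of the standard), might look like this:

```python
# Commonly quoted ISO 26262-5 hardware architectural metric targets.
# SPFM/LFM are minimums; PMHF is a maximum budget in FIT.
ASIL_TARGETS = {
    "ASIL-B": {"spfm": 0.90, "lfm": 0.60, "pmhf_fit": 100.0},
    "ASIL-C": {"spfm": 0.97, "lfm": 0.80, "pmhf_fit": 100.0},
    "ASIL-D": {"spfm": 0.99, "lfm": 0.90, "pmhf_fit": 10.0},
}

def meets(asil: str, spfm: float, lfm: float, pmhf_fit: float) -> bool:
    t = ASIL_TARGETS[asil]
    return spfm >= t["spfm"] and lfm >= t["lfm"] and pmhf_fit <= t["pmhf_fit"]

print(meets("ASIL-B", spfm=0.93, lfm=0.65, pmhf_fit=80.0))  # True
print(meets("ASIL-D", spfm=0.93, lfm=0.65, pmhf_fit=80.0))  # False
```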
Serughetti: At the end of the day, the verification part is about verifying that the safety mechanisms do what they are supposed to do. That has to be the focus. You cannot go and say, ‘I have to do everything everywhere.’ You have to be smart about that, and that starts at the analysis.
Stellfox: And again, it starts with, 'What is it?' Think from the system level: you're designing a car, and some subsystem in the car depends on a chip for its logic. So you need to think about what this chip is supposed to do functionally for that part of the car and, if it fails in a random way, whether that failure would cause some unsafe risk to life. It all starts from the use cases. You have to really understand them to come up with a good safety architecture for your chip. It really needs to be top-down, and the bigger challenges are the companies coming into the market from mobile or other places, trying to take the big compute-intensive chips they have and saying, 'Oh, let's just make that an ADAS server engine.' It needs to be thought of top-down: you have a car system, what are the use cases, and then you build a safety architecture for that. Then you have to verify the safety architecture by injecting faults and making sure the safety mechanisms detect the faults that would otherwise propagate up and cause the chip to fail, so that the chip can go into some safe state.
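A toy version of that fault-injection step, with random draws standing in for a real fault simulator, shows the classification the campaign is really after: safe, detected, or residual (dangerous and undetected). The probabilities are invented:

```python
import random

def run_campaign(n_faults: int, dc_true: float = 0.97, seed: int = 1) -> None:
    """Toy fault campaign. Each injected fault is classified as safe
    (never propagates), detected (a safety mechanism fires first), or
    residual (reaches a safety-critical output undetected)."""
    rng = random.Random(seed)
    safe = detected = residual = 0
    for _ in range(n_faults):
        if rng.random() < 0.4:        # fault is masked, never observable
            safe += 1
        elif rng.random() < dc_true:  # mechanism flags it in time
            detected += 1
        else:                         # dangerous and undetected
            residual += 1
    dangerous = detected + residual
    print(f"safe={safe} detected={detected} residual={residual}")
    print(f"measured diagnostic coverage = {detected / dangerous:.2%}")

run_campaign(10_000)
```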
Serughetti: It's like any design that you do. You have to spend the time up front to capture properly what you're doing; otherwise you're going to spend a lot of time verifying for nothing. That's the key. And the challenge here is the expertise again, because that's where the expertise comes into the picture, and that expertise may be a combined expertise with a customer. Now, nobody does a chip for a single customer, so there needs to be some expertise within the semiconductor company, but it has to link with the expertise that exists in the systems companies.
Grosse: The reality is, probably 70% of these designs start from, 'I have a device; how do I make it safe?' That's the reality, because all the semiconductor companies that want to move into that space already have existing designs. Maybe they have some safety mechanisms in there already; things like ECC and parity are common concepts. But what else do they need to put in there? We have to deal with a bottom-up approach, as well.
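As a reminder of what the simplest of those common mechanisms buys, here is a toy sketch of parity on a stored word: one extra bit detects any single-bit flip, though unlike ECC it cannot correct it. The code is purely illustrative:

```python
def parity(word: int) -> int:
    return bin(word).count("1") & 1  # even-parity bit

def store(word: int) -> tuple[int, int]:
    return word, parity(word)        # data plus its parity bit

def load(word: int, stored_parity: int) -> int:
    if parity(word) != stored_parity:
        raise RuntimeError("parity error: single-bit fault detected")
    return word

data, p = store(0b1011_0010)
corrupted = data ^ (1 << 4)          # inject a single-bit fault
try:
    load(corrupted, p)
except RuntimeError as e:
    print(e)                         # the safety mechanism caught it
```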
Serughetti: That’s the traditional approach—I have something, I want to make it safe, and then a generation in the future will approach this more from a design perspective.
SE: What does that do then to the automation aspect of creating a safety architecture? How do you handle existing approaches?
Ramirez: There are opportunities to help the user figure out what the right safety architecture is, and let them do it in a way where they're iterating earlier in the process, not having to go through this expensive cycle. Anything we can do as EDA to help them figure out the right combination of safety mechanisms to get the ASIL level they're going after, while optimizing for power, performance and area, is one thing. There are also things we can do to automatically create those safe designs, to alleviate the fact that there's not a plethora of engineers who know safety and know these legacy IPs that need to be converted to be safe. And then there's proving that they are safe, through high-performance fault injection or whatever it takes to close that loop.
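None of the panelists name a tool that does this today; as a purely hypothetical sketch, picking the cheapest set of mechanisms that clears a diagnostic coverage target can be framed as a small search. All mechanism names, coverages and area costs below are invented, and the independence assumption in combined_dc is optimistic:

```python
from itertools import combinations

CANDIDATES = {              # name: (diagnostic coverage, relative area cost)
    "parity":      (0.90, 1.0),
    "ecc":         (0.99, 3.0),
    "lockstep":    (0.99, 12.0),
    "watchdog":    (0.60, 0.5),
    "online_bist": (0.90, 2.5),
}

def combined_dc(mechs: tuple[str, ...]) -> float:
    # Optimistic independence assumption: a miss must slip past every mechanism.
    miss = 1.0
    for m in mechs:
        miss *= 1.0 - CANDIDATES[m][0]
    return 1.0 - miss

def cheapest_for(dc_target: float):
    best = None
    for r in range(1, len(CANDIDATES) + 1):
        for combo in combinations(CANDIDATES, r):
            if combined_dc(combo) >= dc_target:
                cost = sum(CANDIDATES[m][1] for m in combo)
                if best is None or cost < best[0]:
                    best = (cost, combo)
    return best

print(cheapest_for(0.99))  # (3.0, ('ecc',)) with these invented numbers
```

In a real flow the coverage numbers would come from analysis or fault simulation, and the cost model would include power and performance, not just area.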
Serughetti: We talked a lot about verification here, but the implementation aspect plays a role, too. Third-party IP plays a role: IP that vendors are now delivering certified to ISO 26262. There are a lot of pieces that can come together for that.