Verification And The IoT

Experts at the Table, part 1: Application-specific verification, and why quality may vary from one market to the next; why different models are ready at different times.


Semiconductor Engineering sat down to discuss what impact the IoT will have on the design cycle, with Christopher Lawless, director of external customer acceleration in Intel's Software Services Group; David Lacey, design and verification technologist at Hewlett Packard Enterprise; Jim Hogan, managing partner at Vista Ventures; and Frank Schirrmeister, senior group director for product management in the System & Verification Group at Cadence. What follows are excerpts of that conversation.

SE: How big is your verification problem and how is it changing?

Lacey: It depends on the project. Some of our chips are very large node-controller chips that go into our high-end servers. Those take years to develop, and the challenges are unique. And then we have much more power-conscious designs that may go into a media controller. Those bring their own set of requirements. For each project, we adapt to the specific requirements of the particular chips we're working on. Schedule and resources are other constraints that we pull in. But as we look at all of those pieces, we do try to keep consistent verification methodologies. That enables us to re-use our IP or VIP across projects. We have a substantial infrastructure that enables us to share code across projects. Our verification problem is large. The cost of silicon is increasing dramatically as you go to smaller nodes.

Lawless: Our verification problem is compounded by the schedule pressures we’re under and the expectations of getting out these processors on a faster cadence. We have a variety of products, ranging from servers, which require a certain level of verification, all the way down to IoT devices. Our challenge is applying the right levels of verification to get the right levels of quality in this array of devices. There are very different timelines and timetables and margins and costs involved with this variety of devices.

Schirrmeister: Referring back to a presentation late last year from MediaTek on the different requirements for verification and the priorities, what the speaker concluded was that the priorities go well beyond just performance, power and area. He added in-field upgradeability to the list. If you're in a big infrastructure project with 1 million edge nodes, you can't send out someone to switch out all the bulbs and do all the upgrades. With medical IoT and industrial IoT, security and safety are crucial. Those all change the verification flow. It's going way beyond PPA, adding a slew of other priorities. There are commonalities across flows, but we are working with specific flows. So you need PCI Express virtual and real interfaces in the server domain. From a vendor perspective it's a challenge to carve out the commonality of what everyone uses and the application specificity of the flows.

Hogan: From my perspective, it’s about what am I going to invest in. What interests me are edge-based devices and autonomous devices that you bring to the edge. That includes traditional verification as well as verification of devices that are capable of saving a lot of energy. That might include analog CNNs. How are you going to verify that? How are you going to make sure that is reliable in time? Quality and reliability over time play a role in safety, and they will play a role elsewhere. And how much is it going to cost? Verification right now is an unchecked cost. It’s going to make it more expensive to bring innovation to market. If that’s the case, it may prohibit us from bringing out products that we dream up. There are a lot of questions about the verification, not only what we have today but these low-power products that are going to be very application-specific.

SE: It's everything we were doing before, plus we have all of these new things. What does that do to the verification cycle in terms of coverage and time to market?

Lawless: One aspect is mixed signal. We do a really good job of verifying, using a variety of capabilities and tools, especially pre-silicon in digital. But as we see more of a mixed-signal focus in these devices, bringing in a lot of analog components, it becomes a much bigger challenge to get that done early in the process. If you have to wait for silicon to do a lot of that work, you've wasted a lot of valuable time. Trying to pull that in and have it all complete and verified when silicon arrives is really the key. We haven't cracked that nut yet, but I'm hoping we can solve it.

Lacey: We often have challenges as to when our detailed analog models are showing up. We invest in hybrid models that enable us to get the full ecosystem of our validation environment and test in place so that we get some view of our digital logic and how it’s going to interface with what we believe that analog logic is going to be. Then, when the real models show up, we can slip them in. In that case we’re spending more engineering time, but we’re saving schedule, which is our goal in the end.

Hogan: I would imagine that’s around behavioral models for analog?

Lacey: Absolutely. We do try to create those hybrid models so they are as reusable as possible from generation to generation.
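The kind of hybrid model Lacey describes can be sketched in a few lines. As a purely illustrative example (the function and its parameters are hypothetical, not HPE's actual models), here is a behavioral stand-in for an ADC: an ideal quantizer that lets the digital testbench run before the detailed analog model shows up, to be swapped out later.

```python
def behavioral_adc(voltage, vref=1.0, bits=8):
    """Behavioral stand-in for an ADC: ideal quantization, no analog effects.

    Good enough to exercise the downstream digital logic early; replaced by
    the detailed analog model once it becomes available.
    """
    # Clamp the input to the ADC's range, then scale and round to a code.
    clamped = max(0.0, min(voltage, vref))
    return int(clamped / vref * (2**bits - 1) + 0.5)
```

The interface (a voltage in, a digital code out) is what matters for reuse across generations; the internals can be refined without disturbing the testbench.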

Schirrmeister: Models are crucial at different levels. One of the questions in the IoT is how you simulate and emulate it all together. There can be multiple chips together connected by analog. We have users doing that in simulation and emulation, connecting multiple pieces of silicon in their model together with the analog mixed signal around it. It is an interesting challenge from the standpoint of an IP provider, too, because there is a question of who does all of those models. It’s not easy to maintain them all synchronously at all levels.

SE: We're dealing with new concepts here. There are advanced chips at 10/7nm. We're potentially adding photonics. There also is new packaging. This stuff has never been in the market before, and some of it is supposed to last for 10 years or more. Can we make this work?

Hogan: The obvious one is cars. The average life of cars in North America is about 20 years. Think about how many things change in 20 years. How can you reliably predict today what you should verify for 20 years from now? There has to be a way to do this in the field. Today, if you park a Tesla in the garage, the next day it may have a new interface. That’s going to be the norm. Things not only have to be verified for the initial distribution, but they also will have to be verifiable over time. For the digital side we understand how to do that. But how do you do that on the analog side? That’s going to be very difficult because you’re probably not going to change the analog much. It’s going to be programmed one time to be shipped a million times for a specific application. How do you deal with that problem over time? That’s the reliability and safety problem.

Schirrmeister: This is the whole idea of DFT (design for test), with test at the end, where we have a DFT process for safety. Functional safety is a huge issue if you’re doing an ADAS or infotainment design. ISO 26262 is suddenly very important. It has become an issue of which application domain requires which standards to be fulfilled, and that’s how the flows have to adjust and merge.

Hogan: You mean like application-specific verification?

Schirrmeister: Correct.

Lacey: Quality really depends on the product. A lot of times our architecture will incorporate features that help increase our product quality. With our high-end servers, customers expect huge uptime. They cannot go down. It’s very important that we build in capabilities so that if errors occur in memory, they will be corrected. Or if a technician pulls the wrong cable during a maintenance cycle, we need to be able to route around that. If it’s along the line of safety, we’re developing technology around fiber optics to increase our bandwidth and lower the power footprint. But there are safety aspects that come with the use of lasers. There is a lot of logic we’re developing around those devices to make sure they are safe to use in the general environment.
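The memory-error correction Lacey mentions is typically done with ECC codes in hardware. As a simplified sketch of the underlying idea (a textbook Hamming(7,4) code, not HPE's actual implementation), single-bit errors can be located and flipped back without any downtime:

```python
def hamming74_encode(d):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Codeword layout (1-indexed positions): p1 p2 d1 p3 d2 d3 d4
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_correct(c):
    """Correct up to one flipped bit; return (codeword, recovered data bits)."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # parity check over positions 1,3,5,7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # parity check over positions 2,3,6,7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # parity check over positions 4,5,6,7
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-indexed error position, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1         # flip the faulty bit back
    return c, [c[2], c[4], c[5], c[6]]
```

Real server memory uses wider SECDED codes over 64-bit words, but the mechanism is the same: redundant parity bits pinpoint the error so the system can keep running.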

Fig. 1: Uptime matters. Source: HPE

SE: Servers used to be swapped out every two to four years. Is that still the same?

Lacey: Budgets continue to put stress on companies. They want to use their equipment longer and longer. So we're seeing equipment being used for longer periods of time these days.

SE: So how do we build chips that last longer?

Lawless: The chips are getting more complex. There’s a tradeoff between synthetic testing and all the random testing, which really can go on in perpetuity. To achieve that quality, you really have to understand where it’s going, what it’s going to be used for, and take an outside-in approach where you focus on those sets of use cases for how it is being used. You need to understand the software that will run on top of that device. You need to narrow the problem a bit—you’ll never narrow it fully—so you can really focus on validating within that context. That solves several issues. You can get it out to market, and you can do more in-depth testing to make sure it works properly.

SE: Software is an interesting part of this discussion. Software gets updated so frequently that a lot of the concerns about safety and security are rolling back into the hardware. How do we verify everything will work and still make sure that we can keep systems current with software updates?

Hogan: This is where application-specific verification comes in. If you think about a smart phone, what companies do is take the chip they’re developing and run all the software against that, find errors, fix them, and make sure it works over time. Phones don’t have long lives—maybe three years—so in three years there may be changes. That allows companies to bring out chips every nine months. But if you’re building chips for a variety of markets, that’s a much different problem. You have to build application-specific verification, and the more you can narrow it, the better off you are. The problem is that becomes expensive unless you have high volume, so that’s going to be a challenge.

Lacey: There are certainly security issues in terms of updates. We deliver thousands of servers that people deploy in their data centers. The management and upgradeability, and how you ensure only your firmware is running on them and not some rogue firmware, is a critical piece. We have custom management ASICs that allow customers to control that. There are security features built into those chips to make sure only the pieces of software we trust will run on our servers. Security is a growing aspect of our development, even just on the server market—and that’s before you get into the IoT space.
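The core of the trust check Lacey describes is verifying that a firmware image carries a valid cryptographic tag before it is allowed to run. As a stripped-down illustration (using an HMAC in place of the full public-key signature chain a real management ASIC would use), the check looks like this:

```python
import hashlib
import hmac

def firmware_tag(image: bytes, key: bytes) -> bytes:
    """Compute an authentication tag over a firmware image."""
    return hmac.new(key, image, hashlib.sha256).digest()

def verify_firmware(image: bytes, tag: bytes, key: bytes) -> bool:
    """Accept the image only if its tag matches; reject rogue firmware."""
    # compare_digest is constant-time, avoiding timing side channels
    return hmac.compare_digest(firmware_tag(image, key), tag)
```

In practice the root of trust is a public key or hash burned into the silicon, so that even an attacker with platform access cannot forge an acceptable image.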

Schirrmeister: We enable the software updates. Often you want to do this in software, but that's often the point of attack—when a software upgrade happens and you get the wrong upgrade. It's a big concern for end consumers. From a vendor perspective, we're trying to enable security with things like assertions. What is it you don't know about your chip? It's increasingly difficult to define the scenarios for what the chip is supposed to do, but more importantly, what is the chip not supposed to do? And how does it react if it is outside its normal operation?
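In hardware flows these "not supposed to do" properties are written as assertions (e.g. SystemVerilog Assertions) and checked by monitors watching the design. Purely as a sketch of the idea, here is a toy monitor in Python that flags two forbidden behaviors on a made-up request/grant bus (the protocol and latency bound are illustrative assumptions):

```python
class BusMonitor:
    """Toy assertion monitor for a request/grant handshake.

    Flags behavior the design is *not* supposed to show:
      1. a grant with no outstanding request (spurious grant)
      2. a request left ungranted for more than MAX_LATENCY cycles (starvation)
    """
    MAX_LATENCY = 4

    def __init__(self):
        self.pending = None  # cycle number of the oldest unanswered request
        self.errors = []

    def step(self, cycle, req, grant):
        if req and self.pending is None:
            self.pending = cycle  # record a new outstanding request
        if grant:
            if self.pending is None:
                self.errors.append(f"cycle {cycle}: spurious grant")
            self.pending = None
        elif self.pending is not None and cycle - self.pending > self.MAX_LATENCY:
            self.errors.append(f"cycle {cycle}: request starved")
            self.pending = None
```

A monitor like this runs alongside simulation or emulation and trips the moment the design strays outside its defined behavior, rather than waiting for a wrong result to surface downstream.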
