The industry is evolving, with new players, new problems, and new challenges. Is this why verification appears to be struggling?
Semiconductor Engineering sat down to discuss the differences between hardware and software verification, as well as the changes and challenges facing the chip industry, with Larry Lapides, vice president of sales for Imperas Software; Mike Thompson, director of engineering for the verification task group at OpenHW; Paul Graykowski, technical marketing manager for Arteris IP; Shantanu Ganguly, vice president of product marketing at Cadence; and Mark Olen, director of product management at Siemens EDA. What follows are excerpts of that conversation. Part 1 of this discussion is here.
SE: We are seeing new entrants in the silicon development market. Is it possible that the decrease in first-time silicon success is related to this?
Ganguly: If somebody is coming from the software world, they inherently do not understand that doing a hardware model is expensive. If I’m building a software program, I compile and link. It takes minutes.
Graykowski: Welcome to my world.
Ganguly: There are two fundamental things that somebody from a software engineering background doesn’t understand. First, a physical build using synthesis, to the point where you have a model that you can run timing on, is a matter of days or weeks, not minutes. The cost of fabrication, the time it takes, the cost of a re-spin: people don’t get that. And then the second piece is the logistics. You can’t FTP hardware. I can build a software product and put it on the website, and people can download it and fix bugs that way, but hardware doesn’t work like that. This is extremely tedious, and the latencies are days, weeks, months. It’s a different paradigm. That’s why people spend so much money on verification, so they don’t go through these loops.
Lapides: When we think about open source, it’s a little bit broader than just the RISC-V community. Because of the freedom to add custom instructions in RISC-V, you are seeing more system people designing their own processor. They’re coming from the system and software world into hardware, and they are naive. They are trying to cut corners, and they are thinking, ‘I don’t need tools from the Big Three. I’m going to use the open-source tools.’ They don’t have the scars. They haven’t gone through the pain.
Thompson: We shouldn’t be too smug about it because we were all there — but a long time ago.
Ganguly: There are ramifications to being able to add your own instructions and rebuild the core. Some of it is covered, like the compiler, where you can get a compiler that understands these new things. What about debug? What about protocols? There are a lot of other ramifications that we can stumble across, one after another. That can be very painful.
Graykowski: I’ve seen some interesting things, coming from the block level to the system level. There are many folks looking to improve their flows and methodologies. Not that long ago, we were just hacking together SoCs by text editing, which is very error prone. That brings a lot of issues, and it’s just tedious and painful. One of the things we are starting to see is hardware and software having a model that you can share, a central database where if something changes, it propagates out to everyone. If you have a hardware team and software team that are disconnected, and every once in a while they throw a model over the wall, how do you keep those in sync? Maybe you get a new IP, plug it in, and it changes the whole system and everything goes south. Coming up with more formal ways to build these systems, putting them together, and managing the overall process is key. A lot of folks are starting to adopt that type of methodology and bring that together. That’s not necessarily saying it will eliminate bugs, but if you can streamline that process and automate it, fewer bugs will propagate through.
Olen: That’s a really good point, and it happens even within hardware development, let alone across different domains. From design to verification, there are many different areas of investment to help solve this problem. Does anyone really think they can say, ‘We have the best of everything’? Customers might get their formal tools from one place, or simulation from another. I hate to jump into the ‘I’m all for standards’ position, but where we have some standards, whether it’s UPF or UVM or others, we don’t do a great job of really adhering to them. We have customers that want to run heterogeneous simulation, but doing that is a lot harder than it probably should be. Why can’t I take my coverage data from one vendor’s simulator and another vendor’s formal or assertion tool, and bring that all into one environment? It should be easy to analyze that data and make decisions.
Graykowski: I’m so happy to hear you say that.
Ganguly: People can, and I do see some customers doing some of that.
Lapides: With great pain. As a consumer of EDA tools, I can tell you, that is really important.
Ganguly: There is a standard for exporting and integrating coverage data, and I have seen companies take coverage data from multiple sources.
Thompson: And I bet that team had a killer Python coder, or a team of them, who made that happen. It did not happen out of the box.
Olen: And probably involved a very sophisticated team.
Ganguly: These are expert people who have invested a lot of effort in doing it. It is not easy.
Graykowski: I have done it as a contractor, and that’s exactly what we did, merging data from an in-house coverage tool and one from the Big Three. It was not the best standard, but there was a way to do it. I remember correlating that data and putting it together, but it was a full-time job for me.
Thompson: So that’s a full-time job. I’m already spending all this money on the EDA tools. How come I have to hire you, a really skilled guy, to do this work when I should be able to just turn it on?
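The glue work described here is typically custom scripting that normalizes coverage exports from different tools into a single view. Below is a minimal sketch of that idea in Python, assuming each tool can dump its functional-coverage bins as a simple "bin,hits" CSV file. That format, and the file names, are hypothetical choices for illustration; real flows go through the Accellera UCIS API or each vendor's own database readers.

import csv
from collections import defaultdict

def load_bins(path):
    # Read one tool's "bin,hits" CSV export into a dictionary of hit counts.
    hits = defaultdict(int)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            hits[row["bin"]] += int(row["hits"])
    return hits

def merge(*reports):
    # Union the bins across all reports, summing hits for bins seen by more than one tool.
    merged = defaultdict(int)
    for report in reports:
        for name, count in report.items():
            merged[name] += count
    return merged

if __name__ == "__main__":
    sim_cov = load_bins("sim_coverage.csv")        # simulator export (assumed name)
    formal_cov = load_bins("formal_coverage.csv")  # formal-tool export (assumed name)
    total = merge(sim_cov, formal_cov)
    covered = sum(1 for count in total.values() if count > 0)
    print(f"{covered}/{len(total)} bins hit after merging")

Even this toy version glosses over the hard parts the panelists describe: bin names rarely line up across vendors, coverage hierarchies differ, and exclusions and assertion data need their own mapping, which is how the scripting grows into a full-time job.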
Ganguly: The previous point about conformance to standards is interesting. Having lived through that experience in multiple companies, the problem is not compliance with the LRM [Language Reference Manual]. The problem is the excursions from the LRM that a particular product tolerates. That’s the challenge. Everybody is LRM-compliant, but each tool also accepts something beyond the standard that somebody has spread all over their design and that the other company’s tool doesn’t.
Thompson: At OpenHW, we run continuous integration regressions using five commercial SystemVerilog simulators. That means we have to use the lowest common denominator implementation of our test benches and our RTL, because you’re right. We have an LRM, but it’s a little bit like the Bible. Everybody who reads it gets something different out of it.
SE: We talked about new customers coming in. We are also seeing new kinds of design styles, and AI is a completely new area. Are there new demands coming in for verification tools or methodologies that we haven’t seen before, and are they going to drive tools in different directions than in the past?
Graykowski: As an interconnect company, we see that the big players are now doing multi-die interconnects. That is definitely going to bring new verification challenges. It’s not just functionality. You have to worry about the timing characteristics and all the signal integrity going across these dies, making sure the clocking on one die is connected properly to the next, that everything is synced up, and that you get all the intended performance. There’s going to be a lot of need coming from that perspective as we go into these larger and larger systems.
Ganguly: I see two challenges. Superficially, there’s the multi-die stuff, and this is very simple from a verification standpoint. It’s a bunch of logic that happens to be in multiple chiplets, connected with a high-speed interconnect. As a verification model, I don’t care whether they’re on the same die or connected by a bus. From a challenge point of view, the first one is relatively simple. Now you’re going to have IPs that are able to do much deeper access into the memory of other IPs, just because the connection between them is much faster. Essentially, you’ll have a peripheral that used to hang off a PCIe chain, or off a USB port. That’s not the case anymore. It’s talking on a much higher-bandwidth bus, and instead of accessing just local memory, it now has access to the CPU’s level 3 cache. This class of optimization, which people will do for silicon performance and which will show up in verification, is going to be a very interesting challenge. Another, more interesting one involves people who will be aggregating multiple dies from multiple companies into one package. Even if these dies are individually tested, how do I probe a value on a substrate before I commit it to a $15, $20, or $30 package? This is a much bigger challenge.
Thompson: DFT for through-silicon vias is a new thing, a new idea. We are going to have to see an awful lot more of that. And if you take a look at small startups, they are coming up with new ideas, but nobody talks about them. I was looking at a company working in the area of 3D packaging, but they are not talking about verification anywhere. I can see a train coming.
Lapides: I did my first flip chip 35 years ago. One of the things we had in the defense industry was a very rigorous design methodology, with specifications and verification against those specifications. The silicon industry has always been about quicker returns than the defense industry, and it looks down its nose at all the bureaucracy there. But with very small exceptions, things work in defense. It may take a little bit longer, and it may take more people, but there’s a very rigorous methodology they follow to make sure things from different vendors work together.
Thompson: But Apple has a clock called Christmas. The defense industry doesn’t have that.