I’m Almost Done

How do you gauge how complete a project is when the last 10% is the hardest part?


The city of Belgrade is renovating the street where I live. They are also building a new building next to mine so that I can see the construction work from my balcony.

Last week, they blocked the street for some 20 minutes, and people got out of their cars and waited outside for the road to open. The construction workers were not in a hurry, and it seemed like everyone was OK with that, so I took a few minutes to chat with them. One guy said they had started the work in January and were 90% done now. Having my own project’s “almost done” status in mind, I asked him when they would finish. He said: “Well, we are 90% done, and we started nine months ago. We will finish in a month.”


Fig. 1: My street is under construction.

Coverage-driven verification is the best not-perfect idea we have about how to run a verification project.

The idea is simple:

  1. There is no real ‘100% verification done.’
  2. But we have to manufacture and sell new chips.
  3. So we will decide what our 100% should be and then code it into the testbench (functional coverage).
  4. We will do it early on in the project, and then track its progress as a metric of how close the task is to completion.

It is like capitalism: I can fill ten pages with why it is not right, but I stick to it because I don’t have a better idea. Yet.
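To make the four steps above concrete, here is a minimal sketch of what “deciding what our 100% should be” amounts to. It is a plain Python stand-in, not a real testbench: in practice this would live in the testbench's own coverage constructs, and the class name `CoverageModel`, its methods, and the bin names are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CoverageModel:
    """Toy functional-coverage model: a fixed set of named bins with hit counts."""

    bins: dict = field(default_factory=dict)

    def define(self, *names):
        # Step 3: decide up front which scenarios make up "100%".
        for name in names:
            self.bins.setdefault(name, 0)

    def sample(self, name):
        # Called whenever a test exercises a scenario we declared.
        if name not in self.bins:
            raise KeyError(f"undeclared coverage bin: {name}")
        self.bins[name] += 1

    def percent(self):
        # Step 4: the progress metric is the share of bins hit at least once.
        if not self.bins:
            return 0.0
        hit = sum(1 for count in self.bins.values() if count > 0)
        return 100.0 * hit / len(self.bins)

    def holes(self):
        # Bins never hit: the "coverage holes" still to be filled.
        return [name for name, count in self.bins.items() if count == 0]


# Hypothetical bins for an imaginary FIFO block.
cov = CoverageModel()
cov.define("fifo_empty", "fifo_full", "back_to_back_write", "reset_during_read")
for scenario in ("fifo_empty", "fifo_full", "back_to_back_write"):
    cov.sample(scenario)

print(f"coverage: {cov.percent():.0f}%  holes: {cov.holes()}")
```

The point of the sketch is that the number it reports is only as meaningful as the list of bins we chose to define, which is exactly why the 100% is a decision rather than a fact.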

If we inspect the convergence of functional coverage in a typical project, assuming it is written early on in the project, we will see something like the chart below.


Fig. 2: Functional coverage progress over the time of the project.

If you ask the verification engineer what the status is in week 15, you will most likely hear: “I’m almost done.” Especially if the project was planned to last, say, 17 weeks.

In a sense, the verification engineer is right: the testbench is stable, extensive tests are being run with few failures, and the chances of finding a new RTL bug are low.

But timewise, the project is only about 60% done, though nobody can predict that at this point.
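The gap between the two views can be put into numbers with a small sketch. The first function is the construction worker's reasoning: extrapolate linearly from progress so far. The second is the timewise view. The 90% coverage figure at week 15 and the 25-week total are assumptions for illustration; 25 weeks is simply what “about 60% done at week 15” implies, not a number stated anywhere.

```python
def naive_time_remaining(elapsed: float, fraction_done: float) -> float:
    """Linear extrapolation: assume the rest goes at the average pace so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed * (1 - fraction_done) / fraction_done


def time_fraction_done(elapsed: float, total: float) -> float:
    """Share of the actual schedule that has been consumed."""
    return elapsed / total


# The street: nine months in, 90% done, so "we will finish in a month".
street = naive_time_remaining(elapsed=9, fraction_done=0.9)
print(f"street, naive estimate: {street:.1f} months left")

# The verification project at week 15, assuming 90% coverage at that point.
naive = naive_time_remaining(elapsed=15, fraction_done=0.9)
print(f"project, naive estimate: {naive:.1f} weeks left")

# Timewise view, assuming the project really ends at week 25 (hypothetical).
done = time_fraction_done(elapsed=15, total=25)
print(f"project, timewise: {done:.0%} done, {25 - 15} weeks left")
```

Under these assumptions the linear estimate promises less than two more weeks, while the schedule still has ten to go. The extrapolation fails because coverage does not grow at a constant rate: the last bins are the hardest ones to hit.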

Getting from 90% to 100% coverage is where we spend a significant portion of verification time, and it is the phase that everybody hates. On top of that, the time left before the finish line is highly unpredictable.

Industry surveys show us that 30-50% of verification effort goes to debug. A closer look reveals that 80-90% of that goes to inefficient debug. Inefficient debug is the cyclic process I discussed at length in my first blog: a verification engineer circling around a failing test, only to find out at the end that they misunderstood the story of that scenario. It happens more and more as the project progresses, both with failing tests and with the coverage holes we’re trying to fill.

This debug is also inefficient in the sense that it does not reveal an RTL bug and does not teach us anything. Three quarters of the way into an IP verification cycle (for example, in week 18 in the graph above), one would expect the verification engineer to already be an expert in her or his testbench, finding test problems quickly.

This leads us to the second big problem: there is no efficient way for an understanding of the testbench to accumulate and stick. Similar issues are debugged again and again. People tend to forget.

At Vtool, we propose shifting the verification paradigm from debugging failures and missing coverage to diagnostics. We’d first like to understand the story, so when we see a failing test, we can start by asking: “What the heck just happened?” It’s like when you watch a brilliant soccer play and cannot understand what happened because it was too fast for your mind to capture. Then you watch the slow-motion replay, and you get it. Our primary goal is to provide engineers with these “slow-motion” tools.

Cogita, our simulation diagnostics and debugging platform, aims precisely at that.

Fig. 3: Helping to straighten the curve of verification convergence.

When properly utilized, Cogita is set up for the task during the initial debug phases, while cases are still relatively easy. Then, when the hard last 10% of coverage fill comes, it really kicks in, helping engineers straighten the “almost done” curve.


