Exclusive: What happens when you put formal leaders from all over the world in the same room and give them an hour to discuss deployment?
A very important change is underway in functional verification. In the past, this was an esoteric technology and one that was difficult to deploy. It was relegated to tough problems late in the verification cycle, and it was difficult to justify the ROI unless the technology actually did find some problems. But all of that has changed. Formal verification companies started to use the technology to make solutions.
Semiconductor companies started to deploy these solutions to solve selected verification problems in their designs. Those solutions tackled narrow, well defined problems and they did it in a way that did not require knowledge about the algorithms being used. This transformed formal from being a tool for the rich into an indispensable tool.
That was phase one of the transformation. Phase two is underway now. Formal verification is no longer just a back-end tool, but in many cases is seen as the first verification tool to be deployed — long before simulation testbenches are ready. Semiconductor Engineering was invited to take part in what is probably the most amazing collection of formal leaders to have ever gathered in a single room to discuss the deployment of formal. Fifteen experts came from all corners of the world and from just across the street. The event was organized by Oski Technology, a formal verification services company.
The event was started with a keynote talk given by Dan Lenoski, co-founder and vice president of engineering for Barefoot Networks. He talked about the specific challenges that his company faces in the creation of high-performance networking, and specifically the verification problems it faces with a new software-defined switch.
“A lot of the lookup problems are a lot more complicated than most people would guess,” explains Lenoski as he goes through the various elements of the system. “It includes things such as longest prefix match and ACL lookup, which often requires a TCAM lookup and may include wild cards or ‘don’t cares’ and possibly include ranges. If it is a fixed pipeline, you need an architectural reference model, and you can do lots of constraint-based random testing and for certain pieces. Formal property checking is possible.”
Lenoski explained how the memory architecture can affect the performance of the system and even lead to packet dropping. He said the verification of this is very hard. “Debugging is also hard. You can suddenly see garbage coming out, or because of a memory leak you start having problems two seconds into operation. It takes thousands of cycles between interesting events.”
Formal cannot be directly applied to this type of problem without being smart about it. “You don’t need to consider 64 bytes; 1 byte will do. You don’t need all of the queues. We have had success with using formal to help verify these types of problems.”
Those were the problems of the past. “We are now looking at making the system a lot more flexible. You can take protocols described in a language called P4, and using a compiler, map that into the different stages of the new architecture. Before, you had to re-spin an ASIC to modify a pipeline, but now we can define it in a piece of software and achieve the same performance. It does, however, add more verification challenges.”
Lenoski and his team are still working out the best way to apply formal to the problem, including using formal on the P4 description or equivalence checking between it and the output of the compiler. “We need best-of-breed techniques to get the job done, and that requires constrained random, proven VIP, Emulation and formal methods.”
Following the keynote was a roundtable conversation. What follows are generalizations of that discussion. Most of the attendees asked that they not be quoted directly, so all comments have been paraphrased and made anonymous. The full list of attendees is provided at the end of the article.
Seated (L to R): Youngsik Kim, Dave Parry, Ram Narayan, Ashish Darbari, Syed Suhaib
Standing (L to R): Ali Habibi, Jesse Bingham (photobomber from Intel, winner of Oski Chessboard challenge), Richard Ho, Normando Montecillo, Stuart Hoad, Roger Sabbagh, Vikram Khosa, Vigyan Singhal, Viresh Paruthi, Erik Seligman, Swapnajit Mitra, Dan Lenoski
Dave Parry, chief operating officer for Oski, got them started by asking how they presented the ROI for formal within their respective companies.
One way is to define it as an insurance methodology. “In order for us to prove that formal is good insurance we had to combine functional and formal in such a way that functional verification quickly found issues and problems such as typos, compilation errors and beyond. Then formal comes in and figures out the deep-rooted problems. This has been very successful. I am not sure if this increases the ROI, but it makes formal more valuable because it provides a good methodology to deploy.”
Others are experimenting with different deployment approaches. “We used to let functional verification do its thing for months and then start using formal. The number of bugs found by formal was always less but of higher quality. Now we are trying a modified approach where we deploy formal early on. With formal you can write a cover property, and using a tool you can draw the traces and hit the button. We have also seen designers that want to prove assertions. This provides motivation to write checkers early on, and this is helpful in the long term.”
Several of the members confirmed that formal is being used a lot more in the early stages of design, but they have to counter the perception that formal is being used for the easier bugs that simulation would have been able to find anyway. “It is not really the easy bugs. If you deploy formal early on you may hit some corner cases. The goal is to bring bugs that were found late in the process, earlier.”
It was pointed out that two different things were being discussed. “There is the motivation of a post-silicon bug or type of design that is very formal-leaning and you want to push formal on that. Then there is another thread of doing early formal because it is more efficient than other methods. We have to separate out those two usage models because they have different kinds of ROI.”
Another person cautioned about how ROI is defined. “When using formal, engineers tend to set up their environment but they don’t actually file any bugs reports because they just fix bugs as they find them. That means that management assumes that they didn’t find any bugs with formal.” One response to this is that there must be feedback in the system to make sure management knows the value that is being provided.
This led to a heated debate about the value of counting bugs as a measure of goodness. “It is not a bad thing if I don’t find bugs! The best way is to find as many bugs as you can and in some projects they can find them quicker in simulation. It is not a competition.”
There was a general consensus that the food fight has to stop. One of the members recounted a story from within their company. “On one design, formal found several critical bugs just before tapeout and the team received lots of management attention and praise. However, on another team, formal found no bugs but did obtain a proof of correctness. They received no praise or attention.” Others agreed that verification teams should not be rewarded based on bugs found, especially if you get a really good designer who creates few bugs to start with.
Another approach to get formal deployed is to look at problems that are very difficult for simulation, but easy for formal. One member said that “the first thing we did to expand formal was to consider top-level muxing. We have a chip with a bunch of cores, and then use muxing to the external I/Os. That structure is simple and we used to do it in simulation, but it takes about two weeks. The problem is that you have to do it as the last things before tapeout. With formal you can do it in six hours. If you present it to management in that way, you get the necessary resources right away.”
There are several thoughts about where formal should be applied. “There is no reason to do the same things that simulation can find. It is a waste of time. Some of the blocks are trivially verified in simulation or best suited to emulation. For legacy designs, where everything is already in place, that is best left for an emulation environment. There is no point in trying to replace it with formal.”
But not all agree. “There are cases where simulation would only find something after a million years. It is not about what could have been. If there is a massive overlap in the kinds of bugs you are looking for, then I agree it would be more challenging to justify. In a new design, anyone can find bugs using any technique. But you can show the designer that some of the bugs that you found through formal are almost impossible to find in simulation because they are nested deep within the FSM. Time is an important aspect because everything is related.”
One way around the problem is to deal with it in the verification planning stage. Now, it is just a matter of which verification tasks are assigned to simulation, emulation or formal, and that each of them does the job that they were assigned. This is purely based on the capability of the tools. “There are a lot of legacy simulation testbenches and the testcases for those are already developed. We are not going to throw them away and stop doing simulation. For attacking concurrency at the SoC level, we use emulation. There is no comparable solution. With an FPGA you can achieve 5MHz or so, so you can start bringing up some software. If you stick with simulation, that is impossible. With formal it is similar. For areas where it is well defined, it can be trivial for formal but very complicated for emulation. Each has a niche.”
But there are times when a legacy design changes in ways that can cause a re-evaluation. This could be bugs found, a change of frequency or a technology change. Add to the mix that the original developers of the design or the testbench may no longer be available and a simple modification of the legacy simulation environment may no longer be the best approach.
Another question that arose was the construction of the teams and who should be applying formal. “I have not been successful getting designers to use formal during the early stages. The designers who are attempting to it don’t do a very good job. Those who are successful are a very small percentage. What we have found better is to get a highly qualified verification person to do the formal. They can do more than three designers.”
Again, the discussion divided between those who were looking for deep issues versus those attempting to use it early in the design process. “The designers can go very deep into the design, but sometimes going more abstract can lead to more interesting test cases. You need a combination of both.”
While there was agreement that designers hold essential information that makes verification easier, that is only a part of what is needed to complete the task. “It is a challenge to write assertions. It takes a certain mentality. When I look at the assertions we write today, a lot of them mimic the RTL. It is really hard to change the mentality and say that is not what we want.”
So who should be writing the assertions? “All of the success we had with formal verification was achieved without a single assertion being written by designers. All of the assertions and properties were done by the verification team. The designers today are overburdened. The designers design. The verification person is responsible for writing the properties to verify the design, and that marriage has been satisfactory to the point where we have 100% buy-in from every designer and verification team who has applied formal.”
No conclusion was reached about the flexibility of people within the verification team. Some were concerned that the construction of testbenches tends to be very inefficient, but this is not a problem for simulation. However, when addressing formal techniques, efficiency is paramount if results are to be expected. That again requires a different mentality and different qualities for the people deploying formal. As with other parts of the discussion, this varies depending on whether deep bugs are the target or one is using formal apps that are tailored to very specific problems.
What is clear is that formal is being used in increasing amounts by the largest semiconductor companies around the globe. We are past the stage where formal has to be justified. It is more a question about the places where the benefits are the greatest, and on that topic there is still plenty of discussion to be had.
Participants included: Ali Habibi, Qualcomm; Richard Ho, Google; Swapnajit Mitra, Avago; Ashish Darbari, Imagination Technologies; Vikram Khosa, ARM; Ram Narayan, Oracle; Erik Seligman, Intel Corp.; Roger Sabbagh, Huawei; Youngsik Kim, Samsung; Stuart Hoad, Microsemi Corporation; Normando Montecillo, Broadcom; Syed Suhaib, Nvidia; Viresh Paruthi, IBM and Dan Lenoski, Barefoot Networks.