Just when you thought it couldn’t get any more complicated…
It used to be fun to be a chip architect. You could wake up in the morning, grab a cup of strong black coffee and run through a few power and performance tradeoff calculations before deciding on the high-level architecture. That would set the engineering direction for months, if not years. On a good day, after introducing a steady infusion of caffeine into your bloodstream, you felt like the all-powerful creator of an electronic universe.
That dream job began showing its first signs of vulnerability at the 130nm process node, especially as the SoC began emerging as the leading design platform. The job description began weakening further at 90nm, and by 65nm it had devolved into something far less satisfying—and the trend only gets worse from here. More people are entering the conceptual design phase of building a chip with each rev of Moore’s Law. Suddenly there are people talking about power budgets and yield, and verification engineers trying to build in ways to solve their problems earlier. Managers are screaming for first-time silicon success. And software engineers—who, incidentally, no one has ever understood very well—are now sitting at the table at initial conception, slurping Diet Coke or Mountain Dew, and speaking a language no hardware engineer can understand.
Welcome to the brave new world of hardware engineering. It’s called system-level design, and it has become so complex that just getting the job done now requires steady, concurrent input from multiple disciplines. Engineers are struggling to keep up with multiple power domains, multiple cores that exist only because classical scaling for performance died at 90nm, and timing issues complicated by shared buses, shared memory, and shared resources within engineering groups.
“The technologies for low-power design are well understood for silicon,” says Nikhil Jayaram, director of CPP engineering at Cisco Systems. “The challenge is in the complexity of those technologies. You have to ask yourself, can you pull it off in a reasonable design cycle?”
The answer is always yes, of course, but the cost is not always easy to swallow. Complexity is measured in terms of additional resources. Jayaram says that number is about 20% to 50% extra per design, depending on the complexity of the design itself. Why? “You have to buy more tools and use more people.”
There are plenty of tools, too. To address this complexity, vendors have been introducing a steady stream of new tools that raise abstraction levels or combine multiple tasks. Those go hand in hand with new standards such as TLM 2.0. But the learning curve on these new tools and standards is steep, demanding time from engineers who are already hard pressed. Even the IP that is supposed to simplify chip design and development is so complicated that it often needs additional IP just to ensure it can be debugged or manufactured properly.
One verification engineer at a very large, well-known chip maker (he asked for anonymity because he didn’t get approval from his bosses before talking to System-Level Design) said overload is becoming a serious issue among engineers.
“Designers are required to become experts in three completely different languages that the industry has standardized on as mainstream,” says the engineer. “The languages are SVA (SystemVerilog Assertions) for the assertion-based methodology, SV (SystemVerilog) for the testbench methodology, and C/C++ for system-level hardware/software verification. A verification engineer cannot get by without becoming an expert in these three languages. The way to deal with this is through the right schooling so that engineers come out with the expertise in all three. Standards have definitely helped with this. The frustration, of course, will be for the engineers who have been on the job for many years and now need to become skillful in three different areas. As things are today, I am finding it very difficult to justify all three methodologies to my customers, and they are missing out on quality because of this.”
That’s only part of the problem in verification. While five years ago engineers were complaining about getting too little data back from foundries such as TSMC, UMC and Chartered Semiconductor, they’re now complaining about being flooded with data. There are volumes of it—literally—and there’s no way other than plain luck to pinpoint a bug without running tests on broad areas of that data. TLM 2.0 purportedly will help (see related story), but the TLM 2.0 tools themselves carry a fairly steep learning curve. How do you construct a test model, for example, using object-oriented code?
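For engineers sizing up that learning curve, it may help to see how small an object-oriented test model can actually be. What follows is a minimal sketch only, assuming a SystemC 2.3-or-later installation with the TLM 2.0 headers; the module names (TestInitiator, SimpleMemory), the address map and the data values are purely illustrative, not drawn from any real design or methodology. It binds one initiator to a trivial memory target and checks that a write can be read back through the blocking transport interface.

// A minimal TLM 2.0 test model: one initiator bound to a trivial memory target.
// Assumes SystemC 2.3+ with the TLM headers; all names here are illustrative.
#include <systemc>
#include <tlm>
#include <tlm_utils/simple_initiator_socket.h>
#include <tlm_utils/simple_target_socket.h>
#include <cstring>
#include <iostream>

// Target: a 256-byte memory that services blocking transactions.
struct SimpleMemory : sc_core::sc_module {
    tlm_utils::simple_target_socket<SimpleMemory> socket;
    unsigned char storage[256];

    SC_CTOR(SimpleMemory) : socket("socket") {
        std::memset(storage, 0, sizeof(storage));
        socket.register_b_transport(this, &SimpleMemory::b_transport);
    }

    void b_transport(tlm::tlm_generic_payload& trans, sc_core::sc_time& delay) {
        sc_dt::uint64 addr = trans.get_address();
        unsigned int  len  = trans.get_data_length();
        if (addr + len > sizeof(storage)) {                    // out of range
            trans.set_response_status(tlm::TLM_ADDRESS_ERROR_RESPONSE);
            return;
        }
        if (trans.get_command() == tlm::TLM_WRITE_COMMAND)
            std::memcpy(&storage[addr], trans.get_data_ptr(), len);
        else if (trans.get_command() == tlm::TLM_READ_COMMAND)
            std::memcpy(trans.get_data_ptr(), &storage[addr], len);
        trans.set_response_status(tlm::TLM_OK_RESPONSE);
    }
};

// Initiator: the "test model." Issues a write, reads it back, checks the result.
struct TestInitiator : sc_core::sc_module {
    tlm_utils::simple_initiator_socket<TestInitiator> socket;

    SC_CTOR(TestInitiator) : socket("socket") { SC_THREAD(run); }

    void run() {
        tlm::tlm_generic_payload trans;            // the transaction object
        sc_core::sc_time delay = sc_core::SC_ZERO_TIME;
        unsigned char wdata[4] = {0xDE, 0xAD, 0xBE, 0xEF};
        unsigned char rdata[4] = {0, 0, 0, 0};

        // Common payload fields for a 4-byte access at a made-up address.
        trans.set_address(0x10);
        trans.set_data_length(4);
        trans.set_streaming_width(4);
        trans.set_byte_enable_ptr(0);

        // Write, then read back, through the blocking transport interface.
        trans.set_command(tlm::TLM_WRITE_COMMAND);
        trans.set_data_ptr(wdata);
        trans.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);
        socket->b_transport(trans, delay);

        trans.set_command(tlm::TLM_READ_COMMAND);
        trans.set_data_ptr(rdata);
        trans.set_response_status(tlm::TLM_INCOMPLETE_RESPONSE);
        socket->b_transport(trans, delay);

        sc_assert(std::memcmp(wdata, rdata, 4) == 0);   // read-back must match
        std::cout << "read-back check passed" << std::endl;
    }
};

int sc_main(int, char*[]) {
    TestInitiator initiator("initiator");
    SimpleMemory  memory("memory");
    initiator.socket.bind(memory.socket);   // point-to-point binding
    sc_core::sc_start();
    return 0;
}

Even in this toy form, the generic payload, the blocking transport call and the socket binding are the same mechanisms a production TLM 2.0 model uses; the learning curve comes from layering timing accuracy, interoperability and coverage on top of them.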
There’s a reason why verification still accounts for 70% of the NRE time budget and cost of developing new chips. Despite the industry throwing lots of money, resources and the best minds in the world at the problem, that number hasn’t budged much.
IP, Verification IP and Insurance IP
Nowhere is this overload more evident than in the IP world. Why write a piece of code for a standard interface or a piece of memory if a company with experts on the bleeding edge of the technology has already done it? That way of thinking is growing, and IP is a big market. But the problems of five years ago, when companies bought advanced IP only to face challenges, and potentially huge expense, in getting it to work, remain enormous.
Buying IP isn’t like buying a pair of shoes. It’s more like setting up a deep partnership that lasts for the life of a chip’s many iterations. And getting those partnerships to work properly can be a time-consuming process. That explains why many of the smaller IP companies have evaporated, even though a decade ago pundits said the low barrier to entry for IP startups would create a vast array of parts that could simply be plugged into a system on chip. Things didn’t work out so well in the real world.
“When you walk into a partnership, you need to get a complete match on the methodologies and tool sets,” said an engineer who spoke on condition that he not be named. “This is soooo difficult. Very high-level managers are finding themselves bleeding trying to make this work. Your tool set may be delivered by multiple vendors in addition to internal tools. Internal tools cause even more problems related to support, IP, etc.”
The engineer noted that standards will help solve this—everything from standard formats to standard languages and methodologies, which is what the new verification IP committee is trying to tackle.
Business As Usual?
Beyond all of this, there is the incursion of the business groups. It was hard enough just to build chips that worked. Now they have to be built on time and within budget, and they have to include more complex technology and tricks than ever before.
One solution for keeping chips on budget is using the lowest-cost tools. The problem with that approach, engineers say, is that not all tools share exactly the same functionality. So what happens when you run simulators such as VCS (Verilog Compiler Simulator, formerly from Chronologic but now owned by Synopsys), IUS (Cadence’s Incisive Unified Simulator) and Mentor Graphics’ ModelSim? The answers vary by project, and frequently even within the same project.
But no matter how bad it looks, at each new process node there will be more cooks in the kitchen. You can fight it, ignore it or embrace it, but only the last choice is the right one.
—Ed Sperling