Transistor-Level Verification Returns

Technologies that had become specialist tools are moving back into mainstream usage; shift left is not just about doing things earlier in the flow.


A few decades ago, all designers did transistor-level verification, but they were quite happy to say goodbye to it when standard cells provided isolation at the gate-level and libraries provided all of the detailed information required, such as timing. A few dedicated people continued to use the technology to provide those models and libraries and the most aggressive designs that wanted to stride out along a custom or analog path had no choice but to continue to embrace the technology.

Today, new technologies such as finFETs, new physical effects from shrinking geometries, and the desire to reduce margins in cost-sensitive devices are causing an increasing number of people to look deeper into the design and to dust off their transistor-level tools. Thankfully, the tools available today are much more capable than they used to be, but they are still being pushed to their limits in several important ways.

For designers who have always done transistor-level verification, the typical problem is one of size. “The biggest challenge for designers is dealing with the exponential growth in the amount of verification that needs to be done while keeping time-to-market schedules constant or within an aggressive range,” says Hélène Thibiéroz, senior staff marketing manager in the analog/mixed-signal group at Synopsys. “Increasing design complexity, larger extracted netlists (functionality and circuit elements) caused by process scaling and moving to advanced process nodes, additional features, and tougher safety requirements that need to be verified, are all factors that significantly impact design turnaround time.”

Bruce McGaughy, CTO and senior vice president of engineering at ProPlus Design Solutions, agrees. “People that have always done transistor level design, such as the memory and analog houses, just need to do a lot more simulation and bigger simulations,” McGaughy says. “They need to capture more detail and they need to run more and longer simulations because the requirement for verification of different power modes and different types of standards means that there is an explosion in the amount of simulation necessary.”

But things are changing, and they are affecting more people. “The main challenge in transistor level verification is to accurately predict the behavior of the transistor itself taking the surroundings into account,” notes Karthik Srinivasan, corporate AE manager for analog/mixed-signal at Ansys. “Technology scaling, design margining, and correct by construction-based approaches that worked in the past are either too pessimistic to meet the target, or in some cases may miss real design issues.”

As Srinivasan indicates, there are bigger changes happening.

“Those changes fall into three buckets,” explains Ravi Subramanian, general manager for the analog/mixed-signal group at Mentor Graphics. “First, the amount of functionality that is being integrated into silicon. Analog functionality and the operation of digital transistors at very low currents and voltages directly translate into a requirement to do transistor-level verification on much larger sized circuits and blocks. Second, people are shrinking the supply voltage and the process variability is increasing. This means that more accurate models are required because there are more devices and now more parasitics and variation. Those models must incorporate the electrical impact of increasing numbers of physical effects. Third, the type of analysis that has to be done and the time available to do that analysis are both big challenges.”

Shrinking layouts also add to the complexity. “Gone are the days where you could sign off your block or macro in isolation with a few typical vectors,” says Ansys’ Srinivasan. “Now it needs to be verified in the context of the design, chip-package and maybe even the entire system. What were considered to be second- or third-order effects are gaining importance today due to technology migration and higher levels of integration.”

Standard libraries and design techniques have long relied on margining. “The economic and competitive impact of overdesigning becomes more severe at advanced nodes,” explains Yoann Courant, R&D director in the process variability group of Silvaco. “Most design teams working at advanced nodes are adopting variation-aware design flows that include utilizing advanced Monte Carlo, statistical corners, sensitivity analysis, high sigma estimation, and other related techniques to ensure that design margins are well understood and trade-offs can then be made knowing the potential impact in terms of yield.”

Those standard libraries are the result of an iterative process. “The first product out of the foundry is the device model,” says Hany Elhak, director of product management for circuit simulation and library characterization at Cadence. “They do library characterization and create test chips, and based on the feedback from those they optimize the process for leakage current, for power, for speed. And then they create new device models that are used in characterization, leading to other test chips that they measure.”

It is within that process that margins are added to ensure sufficient yield. “Customers that require very high performance or low power tend to create their own libraries that are optimized for each,” explains Elhak. “They may do a mix, where they get the base library from the foundry but do some modification, or get libraries characterized for a certain corner or in-house characterize them for a different corner.”

It is somewhat ironic that the slowing of Moore’s Law is indirectly affecting this, as well. “There are more design starts on older nodes,” points out Subramanian. “There will be a healthy number of design starts on 55nm and 45nm, and even 130nm. Design teams are now trying to squeeze more out of those nodes in order to maximize what they are able to get on that node and not have to move to more advanced nodes. To do this you not only have to quantify your digital margin, but also your analog margin. This is new. We are now seeing people ask the question, ‘How do I quantify my design margin for analog circuits?’”

To fully understand the problem, it is necessary to dig down deeper into the reasons for the complexity increase.

Smaller geometries
Some of the problems are just because devices are getting down to atomic dimensions. “In previous generations you had hundreds or thousands of atoms in the length of a channel but today it is down to tens,” says McGaughy. “The number of dopant atoms for a threshold voltage implant is in the range of tens of atoms for a small transistor. The back-end process is changing, as well, and this affects the wires. It becomes more challenging when you get to small dimensions in the metal layers because it creates bigger variations in the metal resistance and capacitances.”

“Variation exists in every process,” says Elhak. “But as the dimensions of the device get smaller, the device sensitivity to variation increases. At 90nm, analysis was only required for very sensitive analog blocks. With 14nm, variation analysis is standard even for digital designs. But it is not just variation. There are other effects that get amplified at these nodes: electromigration (EM) and ageing, the degradation of a MOS transistor over time, which is caused by thermal effects.”

This has added a significant requirement to transistor-level simulation. In the past, accurate voltage waveforms were necessary, but today accurate current waveforms are also required so that power, thermal and other factors can be verified.

But it doesn’t stop there. “Another factor is layout-dependent effects,” adds McGaughy. “The layout itself is causing big variations from different patterns. You may have transistors that in previous generations had the same geometries, and they could be considered to be identical devices. But because of the layout around them, those transistors are no longer identical.”

New devices
“With the transition from the planar transistor at 28nm to the finFET at 16nm and below, we are seeing a move to a new type of BSIM model, and the complexity of that model has increased significantly,” says Subramanian. “These models are equation-based models and the complexity is measured by the number and type of equations in these models. Going from planar to finFET, the modeling complexity has increased by over 100X in terms of the raw number of computations required per transistor. That means that for every transistor, you need 100X more computations.”

McGaughy explains one of the reasons for this added complexity. “The gate is now wrapped around the channel and this is a 3D device, so the source and drain have more coupling to the gate, as well. That causes more feed-through of charge when the gate is switching. In previous generations, we worried about the variations in the IV curve, so if you could get the threshold voltage right and understand its variation, then with the transistor’s IV characteristics you were most of the way there. Now, the charge variation is a significant factor and it makes it much more difficult to model and extract variation in charge.”

Not everything about finFETs is a positive either. “Although the FinFET transistors provide high drive strength, they have poor heat dissipation due to 3D finFET structures,” explains Srinivasan. “With higher device and wire density, higher power consumption and Localized Thermal Effect (aka self-heating) become significant, impacting the reliability of the devices and interconnects significantly.”

Interconnect length increases exponentially with technology scaling. “In new chips you have very long interconnect wires with very small dimensions in terms of thickness and depth,” points out Elhak. “That creates high current density and high operating temperature. Copper has been replacing aluminum to meet some of the technology challenge, but EM is still a challenge. When electrons flow in the metal, first they heat it and then they put pressure on them which can cause damage. In the past we only did this analysis for very sensitive analog blocks or high power blocks used in automotive, but today mainstream designs in wireless and wearable that are moving to advanced nodes have to take this into account.”

Reliability issues associated with the ageing of devices are caused by two phenomena: hot carrier injection (HCI) and bias temperature instability (BTI). Hot carriers are particles that attain enough kinetic energy to be injected into forbidden regions of a device, such as the gate dielectric, and then get trapped. This leads to threshold voltage changes and transconductance degradation in the device. Bias temperature instability is caused by chemical breakdown at the interface between the silicon dioxide layer and the substrate. It causes an increase in the absolute threshold voltage and a degradation of several attributes of the device. Both of these effects can lead to damage of the devices over time. Libraries now need to be aware of all of these issues.

Oxide thickness and mask alignment are the main contributors to variation. “Consider oxide thickness,” says Elhak. “As it gets smaller we are talking about just a few atoms within that layer so any small change in that thickness will contribute a bigger variation in current because current is the average flow of electrons. As the oxide gets smaller, the more sensitive it becomes to slight changes. The same is true of mask alignment and other contributors to variation.”

What makes some of this variation more difficult to handle is that the variation is random. In the past, it was reasonable to assume that devices close together would be equally affected by process variations. This is no longer true, and it can disturb sensitive device structures such as matched pairs or current mirrors.

In the past, designers dealt with variability by adding margin, but now the sensitivity to varying parameters is higher, so you would need to add even more margin. As designers are competing on the performance of chips, you cannot keep adding more. You have to analyze the variation and it has to be taken into account during the design process.

“Monte Carlo is the traditional approach for variation analysis,” says Courant, “but it is becoming too costly as thousands of runs are needed in order to statistically converge to an acceptable precision. Advanced Monte Carlo techniques are required that provide speedup in terms of number of runs for small to large circuits.”
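The cost Courant describes follows from basic sampling statistics: the uncertainty of a Monte Carlo estimate shrinks only with the square root of the number of runs. The sketch below illustrates this with stand-in random trials rather than real SPICE simulations. The function names, the 1% failure rate, and the normal-approximation confidence interval are all assumptions made for illustration.

```python
import math
import random

def ci_half_width(fail_prob, n_runs, z=1.96):
    """Normal-approximation half-width of a confidence interval (z=1.96 ~ 95%)."""
    return z * math.sqrt(fail_prob * (1.0 - fail_prob) / n_runs)

def run_monte_carlo(n_runs, true_fail_prob, seed=0):
    """Stand-in for n_runs circuit simulations: each 'run' fails at random
    with probability true_fail_prob (a hypothetical value, not measured data)."""
    rng = random.Random(seed)
    failures = sum(rng.random() < true_fail_prob for _ in range(n_runs))
    return failures / n_runs

if __name__ == "__main__":
    for n in (100, 1_000, 10_000, 100_000):
        estimate = run_monte_carlo(n, true_fail_prob=0.01)
        spread = ci_half_width(0.01, n)
        print(f"{n:>7} runs: estimated fail rate {estimate:.4f}, "
              f"expected 95% half-width {spread:.4f}")
```

In this simplified picture, halving the uncertainty takes four times as many runs, which is why plain Monte Carlo becomes expensive when tight precision is needed.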

Subramanian points out that there is a move to use statistical techniques more intelligently. “We are at the early days of this. People are looking at, and starting to use an approach called ‘The Design of Experiments’.”

For every set of simulations that are run, it is like running experiments on a population and you need to pick the size of the population you need based on what you are trying to measure. “Increasingly, people are looking at how many simulations should be run in order to get a certain confidence interval for a specific type of measurement,” explains Subramanian. “If you have five different types of measurements, you would actually define five different types of experiments to be able to capture those measurements with a certain degree of confidence. Very late in the design cycle, close to tape-out, you would require a very high confidence interval to make sure that the products coming out of the door fit within a certain window in terms of their performance.”
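One textbook way to size such an experiment is the sample-size formula for a pass/fail proportion, n = z² p (1 − p) / e², where p is the expected yield, e the acceptable error, and z the normal quantile for the chosen confidence level. The sketch below applies it to two hypothetical measurements. The measurement names, yields and tolerances are invented for illustration, and this is not necessarily the method the tools Subramanian refers to use.

```python
import math
from statistics import NormalDist

def runs_needed(expected_yield, margin, confidence):
    """Runs needed to estimate a pass rate to within +/- margin at the given
    two-sided confidence level, using the normal approximation."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil(z * z * expected_yield * (1.0 - expected_yield) / margin ** 2)

if __name__ == "__main__":
    # Hypothetical measurements, each with its own precision target.
    experiments = {
        "offset voltage": (0.99, 0.005),
        "settling time": (0.95, 0.01),
    }
    for name, (expected_yield, margin) in experiments.items():
        early = runs_needed(expected_yield, margin, confidence=0.90)
        late = runs_needed(expected_yield, margin, confidence=0.99)
        print(f"{name}: {early} runs at 90% confidence, {late} runs at 99%")
```

Raising the confidence level close to tape-out increases the run count for every measurement, so each experiment ends up with its own budget.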

Subramanian says that by looking at customer designs and nodes as they move to 28nm and below, they see a dramatic increase in the number and types of simulation. “We are also seeing this on other nodes where there is increasingly a requirement to do more types of reliability analysis – for power electronics, LED lighting or certain automotive power functions, they need far more simulations and they are bringing statistical techniques to bear to help manage the complexity.”

Standard cells and memory bitcell designs are among the most critical and sensitive to variation. “We are still seeing teams using simplified statistical analysis,” says Courant. “These include Gaussian extrapolation (even when the actual distribution is not Gaussian) or increasing sampling in the distribution tails with the hope of generating more failures. What is typically observed is that some designers are doing as much Monte Carlo simulation as possible depending on available time and resources without being actually aware of the final precision of the analysis.”
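The risk in Gaussian extrapolation can be shown with a small calculation. The sketch below assumes, purely for illustration, that a measured quantity follows a lognormal distribution. It then compares the value predicted at k sigma by fitting a Gaussian (mean plus k standard deviations) with the value the lognormal distribution actually reaches at the same tail probability. The distribution choice and parameters are assumptions, not data from any design.

```python
import math

def lognormal_moments(mu, s):
    """Mean and standard deviation of a lognormal with log-mean mu, log-sigma s."""
    mean = math.exp(mu + 0.5 * s * s)
    std = mean * math.sqrt(math.exp(s * s) - 1.0)
    return mean, std

def gaussian_extrapolated_value(mu, s, k_sigma):
    """What a Gaussian fit to the mean and std would predict at k_sigma."""
    mean, std = lognormal_moments(mu, s)
    return mean + k_sigma * std

def true_lognormal_value(mu, s, k_sigma):
    """Value at the quantile whose tail probability matches k_sigma."""
    return math.exp(mu + k_sigma * s)

if __name__ == "__main__":
    for k in (2, 3, 4, 5):
        guess = gaussian_extrapolated_value(0.0, 0.5, k)
        actual = true_lognormal_value(0.0, 0.5, k)
        print(f"{k} sigma: Gaussian extrapolation {guess:.2f}, actual {actual:.2f}")
```

For this skewed example the Gaussian fit increasingly underestimates the tail as sigma grows, which is the kind of error that goes unnoticed when the final precision of the analysis is not checked.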

Memory bitcells have long been one of the most active users of transistor-level verification. “The cell is very small and usually between 6 and 12 transistors, but it is repeated millions of times,” says Elhak. “This means that any impact to the cell is multiplied by the repetition. Today memory designers have to do intensive statistical analysis with very high sigma. Usually this is at least 6 sigma. Knowing the probability of a single cell to fail does not tell you much about the reliability of the whole array. You have to run Monte Carlo analysis using SPICE and a lot of statistical techniques to simplify the analysis so that you can reach the desired sigma levels.”
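A first-order calculation shows why repetition drives the sigma requirement so high. The sketch below converts a sigma level into a one-sided Gaussian tail probability for a single cell, then computes the chance that at least one cell in an array fails. It assumes independent cells, a Gaussian tail, and no redundancy or repair, all of which are simplifications. The array size is a hypothetical example.

```python
import math

def tail_probability(sigma):
    """One-sided Gaussian tail probability beyond `sigma` standard deviations."""
    return 0.5 * math.erfc(sigma / math.sqrt(2.0))

def array_fail_probability(cell_fail_prob, n_cells):
    """Probability that at least one of n_cells independent cells fails.
    log1p/expm1 preserve precision when cell_fail_prob is very small."""
    return -math.expm1(n_cells * math.log1p(-cell_fail_prob))

if __name__ == "__main__":
    n_cells = 1_000_000  # hypothetical array size
    for sigma in (4.5, 5.0, 6.0):
        p_cell = tail_probability(sigma)
        p_array = array_fail_probability(p_cell, n_cells)
        print(f"{sigma} sigma: cell fail prob {p_cell:.3e}, "
              f"array fail prob {p_array:.3e}")
```

Under these assumptions a cell that looks very safe at 4.5 sigma still makes a failure somewhere in a million-cell array highly likely, while 6 sigma brings the array failure probability down to roughly a tenth of a percent.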

This has been made a lot more difficult with the migration to finFETs. “If you are designing an SRAM you can size to any W you want in previous generations of CMOS, but with finFETs you cannot size, you can only choose the number of fins,” points out McGaughy. “That is the limiting factor in optimizing the sensing amps, the address decode word-line drivers and the bitcell. The limited range of choices makes it difficult to optimize for variation.”
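The sizing restriction McGaughy describes can be illustrated with the commonly used approximation that each fin contributes an effective width of twice the fin height plus the fin thickness. The sketch below uses hypothetical fin dimensions (42nm height, 8nm thickness) that are not taken from any specific process, and shows how a desired width has to snap to a whole number of fins.

```python
def effective_width_nm(n_fins, fin_height_nm=42.0, fin_thickness_nm=8.0):
    """Approximate effective channel width: two sidewalls plus the top, per fin.
    The default dimensions are illustrative assumptions only."""
    return n_fins * (2.0 * fin_height_nm + fin_thickness_nm)

def nearest_fin_count(target_width_nm, fin_height_nm=42.0, fin_thickness_nm=8.0):
    """Whole number of fins (at least one) closest to the requested width."""
    per_fin = effective_width_nm(1, fin_height_nm, fin_thickness_nm)
    return max(1, round(target_width_nm / per_fin))

if __name__ == "__main__":
    for target in (100.0, 150.0, 230.0):
        fins = nearest_fin_count(target)
        actual = effective_width_nm(fins)
        error = 100.0 * (actual - target) / target
        print(f"target {target:.0f}nm -> {fins} fin(s), "
              f"{actual:.0f}nm ({error:+.1f}% off target)")
```

With only a few fins available, each step in fin count is a large fraction of the total width, which leaves little room to trade drive strength against variation in small cells.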

Tool advancement
Thankfully, the EDA industry has been responding to the challenges and both performance and capacity have been increased significantly over the past decade. “A decade ago, if you said that a transistor-level simulation tool could handle over five million elements it would have been considered preposterous,” says Subramanian. “Today the highest capacity SPICE-accurate transistor-level verification tools can handle well over 15M to 20M elements.”

Thibiéroz lists some of the other areas in which tools have risen to the challenge. “Today we have multiple testbenches and formats with thousands of simulation scenarios to set up and monitor. Tightening constraints for healthy yield requires more process, voltage and temperature (PVT) corners and Monte Carlo simulations to be run, and efficient post-processing of verification and analysis data is becoming a must.”

So how has performance been increased even with all of the additional complexity? “The breakthroughs are with the advancements made in parallel simulation,” says McGaughy. “Today’s simulators are built from the ground-up to take advantage of parallel hardware. Moore’s Law for single processors stopped scaling 10 years ago, and the majority of the advancement in hardware or processing power is coming from parallel. This also means having data structures that are optimized for the amount of memory they use and their efficiency on those hardware platforms.”

Design techniques also have been changing, and this is one place where shift left helps. “The most common techniques are to identify hotspots and fix them by running as many Monte Carlo samples as early as possible given project schedules and resources,” says Courant. “Using sensitivity analysis or statistical corners can help make this process more efficient. In some cases correction circuits may be used to mitigate variation risk. For layout dependent effects the circuit design and layout stages have to couple more tightly.”

There is more work that has to be done and new standards that have to be completed. “The main challenge is getting the right level of model abstraction,” says Subramanian. “When you build a model for something, you want to show that it has the right accuracy for the signals that matter compared to the actual transistor-level model. The second challenge is to be able to have a methodology that lets you move in and out of each level of abstraction for different blocks. As a simple example, you could have a phase-locked-loop that has four or five different blocks and you may like to have only the VCO as a fully extracted transistor-level netlist and everything else written in a behavioral model which could be either Verilog-AMS or SystemVerilog-AMS.”

SystemVerilog-AMS is one of those works in progress. Verilog-A was initially created in 1996, and the latest version of its successor, Verilog-AMS (2.4), was released in 2014. However, Verilog is no longer an active standard and this effort has to be migrated to SystemVerilog. In addition, work is underway to focus on new features and enhancements requested by the community to improve mixed-signal design and verification, as well as to extend SystemVerilog to analog and mixed-signal designs through the IEEE 1800 subcommittees.


Kev says:

Accellera’s Verilog(-AMS) may not be an active standard, but the large amount of pushback over the years from the SystemVerilog committee(s) about absorbing its functionality into the IEEE means it’s somewhat unlikely that there will be a functional SystemVerilog-AMS for a number of years, if nothing else: fixes due in Verilog-AMS have been pushed down the road. Which is a pity since it’s actually quite easy to model variability in Verilog-A (along with CDC error detection, and power) for large digital designs.
