Process Detection & Variability

Designing for worst-case process variation can erode the gains made by migrating to an advanced node.


A Q&A with Moortec CTO Oliver King.

What do we mean by process variation?

Process variation is a complex subject that covers a range of effects, but broadly the effects are caused by imperfections in the manufacturing process. Examples include implant variations, mask misalignments, and optical variations. These all add up to give statistical variation around the ideal, or “typical,” transistor.

However, the mechanisms and causes of the variation are not really our concern. What we are interested in is being able to measure, in a meaningful way, where a particular piece of silicon sits within the defined process space for the technology being utilised.

Because the measurement of process is ultimately used to assist with optimizing performance, we relate process to speed, and in advanced nodes this comes down to MOS device speed and the parasitics of the interconnect.

Why is process variability becoming an advanced node issue?

Process variability has always been an issue, and the design process has taken account of it, often by designing for worst case. While this is still possible, doing so erodes a growing proportion of the gains made by migrating to an advanced node. When this is coupled with additional sensitivities to supply voltage (an effect of dropping core supplies), we are now at a point where the cost of designing for worst-case process variation is too high.

Furthermore, with the advent of FinFET processes and the fabrication methods that allow for the densities seen on current leading nodes, process variation is manifesting itself in different ways. Given the limited availability of production data on these nodes, it is too early to say we fully understand it.

How do you use process data?

There is a range of applications for process data. The first is performance optimization. The simplest implementation is speed binning devices. A more detailed approach is to use the process data to optimize each individual die through DVFS (dynamic voltage and frequency scaling). This can be done once, to take process into account, or repeated over time to account for temperature and even aging. It is possible, for example, to reduce power consumption while still achieving a desired speed of operation. It is also possible to take process variation across a die into account, which is being done in large SoCs today.
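The idea of process-aware DVFS can be sketched in a few lines. This is a minimal illustration, not Moortec's method: the ring-oscillator thresholds, the delay numbers, and the linear per-corner scale factors are all invented for the example.

```python
# Hypothetical process-aware DVFS sketch: bin a die from an on-chip
# monitor reading, then pick the lowest supply voltage whose delay at
# that process corner still meets the target clock period.
# All names and numbers below are illustrative assumptions.

# Delay scale factors relative to a typical die (slow silicon is slower).
PROCESS_SCALE = {"slow": 1.25, "typical": 1.00, "fast": 0.85}

# Candidate operating points: supply voltage (V) -> critical-path
# delay (ns) for a typical die at that voltage.
TYPICAL_DELAY_NS = {0.70: 1.40, 0.80: 1.10, 0.90: 0.95, 1.00: 0.85}

def classify_die(ring_osc_mhz, typical_mhz=500.0):
    """Bin a die from a ring-oscillator reading (hypothetical thresholds)."""
    ratio = ring_osc_mhz / typical_mhz
    if ratio < 0.95:
        return "slow"
    if ratio > 1.05:
        return "fast"
    return "typical"

def min_voltage(corner, target_period_ns):
    """Lowest candidate voltage that closes timing at this process corner."""
    scale = PROCESS_SCALE[corner]
    for vdd in sorted(TYPICAL_DELAY_NS):
        if TYPICAL_DELAY_NS[vdd] * scale <= target_period_ns:
            return vdd
    return None  # cannot close timing at any candidate voltage

# A fast die closes the same 1.1 ns timing at a lower voltage than a
# slow die, recovering power that worst-case design would leave as margin.
fast_vdd = min_voltage(classify_die(530.0), target_period_ns=1.1)  # 0.80 V
slow_vdd = min_voltage(classify_die(460.0), target_period_ns=1.1)  # 1.00 V
```

The point of the sketch is the last two lines: identical timing targets, different voltages per die, hence per-die power savings.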

Another application is detecting aging of a chip. This can be either part of performance optimization, as described above, or a means of predicting device failure.

How does this relate to other in-chip conditions?

Ultimately, process, voltage, and temperature are all interrelated, because together they determine how fast a chip will work and how much power it will burn for a given task. Having accurate measurements of all three allows SoC designers to reclaim performance that would otherwise be left as margin.