Some are getting easier, others are getting tougher. Here’s why.
Standards for specifying a chip’s ability to withstand electrostatic discharge (ESD) are changing – in some cases, getting tougher, and in others, easing up. ESD protection has been on a path from a one-size-fits-all approach to one where a signal’s usage helps to determine what kind of protection it should get.
Protecting chips from ESD damage has been a longstanding part of IC design. Requirements and circuits were stable for many years – such that it was a matter of simply plugging in the IP and moving on. Most designers didn’t have to spend any energy making ESD-related decisions until about 10 years ago. “It wasn’t a critical signoff thing. That’s changing,” said Karthik Srinavasan, senior product manager at Ansys.
As silicon processes advance, and as we look more closely at the data, we may have been too hard on ourselves in some cases. And we may need to provide more protection in others.
Two levels of scope: chip and system
There are two different sets of standards for ESD from different organizations, and they represent very different concerns. At the lowest level, chips may be damaged if there is an ESD event during manufacturing. As a result, JEDEC and the ESD Association (ESDA) have a series of specifications for ESD protection on all pins of a chip. These ratings apply only up to the point where the chip is placed onto a circuit board. After that, the pins are less vulnerable.
“The focus is to get the chips onto the board without failing,” said Bart Keppens, director of business development at Sofics.
While an assembled board may be more robust than an individual chip, ESD is still a concern at the system level. In this case, it’s not a concern about manufacturing, but actual use. “Customers expect high ESD protection nowadays, especially in applications with a lot of human interface,” said Robert Gee, executive business manager for the Core Products Group at Maxim Integrated.
While chip-level ESD deals with every signal on a chip, system-level ESD focuses only on signals that make it to the outside world. Consumer-oriented signals like USB and HDMI can be vulnerable if someone touches them after shuffling across a carpeted floor. System-level ESD is specified by a series of IEC 61000 specifications.
The specs and testing methods for chip-level and system-level protection are different, although advanced packaging introduces some gray area. (The focus of this article is strictly chip-level ESD, as addressed by the JEDEC/ESDA specifications.)
There is interplay between those two levels. More protection at the chip level can make system-level protection easier. Protecting the system requires the addition of discrete passive devices, and so it impacts the bill of materials (BOM). “Higher ESD on-chip means less external discretes,” said Keppens. If the chip takes up more of the protection burden, then fewer passives are needed, reducing the BOM.
The challenge is that, as process dimensions shrink, chip-level ESD protection takes more area. “Advanced silicon is too expensive for that,” said Peter de Jong, specialist for ESD and latch-up at Synopsys. So there’s push and pull between the desire for more chip-level protection and the cost of providing that protection.
For customers that really don’t want to have to deal with system-level ESD, lower-cost chip-level ESD protection can be provided by co-packaging transient voltage suppressors (TVSs) with a chip, observed Imec ESD Team Leader Shih-Hung Chen. To the outside, it looks like a single high-ESD chip. Inside, inexpensive TVSs are used instead of expensive on-chip ESD circuits.
Three reliability concerns: ESD, EOS, and latch-up
While the focus here is ESD, it’s closely intertwined with two other reliability notions: electrical over-stress (EOS) and latch-up. Matthew Hogan, product management director at Mentor, a Siemens Business, said that they’re all aspects of what is now being called “Electrically Induced Physical Damage,” or “EIPD.” This general category is partly aimed at providing better failure-analysis (FA) statistics. And pins are becoming more vulnerable at advanced nodes: “FinFET and GAA devices are more prone to latch-up due to higher vertical resistance. It can be caused by ESD, although there are other triggers,” he said.
As Hogan described it, a damaged part would be returned for FA, and upon first inspection the cause might be listed as likely caused by ESD. But the full investigation may not bear that diagnosis out. And to make matters more confusing, the preliminary diagnosis is often not updated to reflect the final outcome. He suspects that ESD failures are overcounted as a result. So the EIPD category provides a more general first-pass diagnosis without biasing the report against any of the EIPD factors that might have been the root cause.
ESD refers to the mechanism by which a problem occurs. According to some, one of the possible outcomes would be EOS, and some people do consider ESD to be a subset of EOS. Others view them as separate. Imec’s Chen said that ESD is a nanosecond-level event, while EOS is a millisecond-level event.
Latch-up also may cause EOS – and ESD may cause latch-up if a device is powered on. Testing for ESD with power applied is now performed to ensure that latch-up doesn’t result. So while the three phenomena are treated as distinct, they have an effect on each other.
Three ESD models – and then there were two
There have traditionally been three different ESD models: the human-body model (HBM), the charged-device model (CDM), and the machine model (MM). They are governed by different JEDEC standards: HBM and MM are covered by JS-001; CDM is covered by JS-002. The ESDA used to maintain its own specifications, but in 2010, the two organizations started harmonizing their specs to avoid confusion.
HBM attempts to model how a human touching a chip would deliver energy. The current results in a moderate voltage at the pins and a long tail as the charge is depleted. By contrast, the increased amount of robotic testing and assembly makes the CDM increasingly important. That model injects current much more quickly, ringing after the first pulse rather than decaying slowly.
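To give a rough feel for the HBM's "long tail," the sketch below treats the discharge as an ideal RC decay into a shorted pin. The 100 pF capacitance and 1.5 kΩ resistance are the values commonly cited for the HBM network and are assumptions here, not figures from this article. The function name is illustrative, and the model ignores the finite rise time and parasitics of a real tester.

```python
import math

# Assumed HBM network values (commonly cited, not taken from this article).
HBM_CAPACITANCE_F = 100e-12
HBM_RESISTANCE_OHM = 1500.0


def hbm_current(precharge_v: float, t_s: float) -> float:
    """Ideal RC discharge current in amps at time t_s (seconds).

    Simplification: instantaneous rise, then exponential decay with
    time constant R*C. A real HBM pulse has a finite rise time.
    """
    if t_s < 0:
        raise ValueError("time must be non-negative")
    tau = HBM_RESISTANCE_OHM * HBM_CAPACITANCE_F
    peak = precharge_v / HBM_RESISTANCE_OHM
    return peak * math.exp(-t_s / tau)


if __name__ == "__main__":
    # Print the decay of a 2 kV pulse at a few points along the tail.
    for t_ns in (0, 150, 300, 600):
        amps = hbm_current(2000.0, t_ns * 1e-9)
        print(f"{t_ns:4d} ns: {amps:.3f} A")
```

Under these assumptions the time constant is 150 ns, which is why the HBM tail is slow compared with a CDM event. A CDM pulse would need a different model that includes inductance to reproduce the ringing.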
Meanwhile, the industry has eliminated the MM. As stated in JEP172, “JEDEC, working with the Industry Council … strongly recommends discontinuance of the Machine Model for ESD component qualification requirements … MM is redundant to HBM at the device level since it produces the same failure mechanisms, and the two models generally track each other in robustness and in failure modes produced … The test method was incorrectly given the name ‘machine model’, though no firm, unique connection between the model and actual machine-induced device failures was ever established.”
While both HBM and CDM are active specs, different applications and technologies will prioritize one or the other – and CDM appears to be gaining greater focus. For devices with finFETs, the primary concern is CDM. Automotive manufacturers, by contrast, are concerned about both HBM and CDM.
Evolving specs
One of the major changes resulting from the ongoing discussions is a move away from requiring the same specs on every pin, which has been 2 kV for HBM. In fact, the automotive industry has put out its own CDM ESD specification, AEC-Q100-011, and it includes, among other things, options for tighter specs on corner pins, since, as noted by Synopsys’ de Jong, “a corner pin is more likely to be hit.” There is also an HBM spec, AEC-Q100-002, which largely reflects the JEDEC spec, albeit with some procedural modifications.
But more change is afoot. The ESD Industry Council is hosting conversations about further changes to ESD requirements. The Council isn’t an accredited standards body, but it’s working with JEDEC to lobby for reduced voltage levels. This is partly driven by the fact that, after measurements on billions of devices, there have been few failures at the traditional levels.
That raises the question as to whether the standards are too strict. “Nobody can explain exactly why it’s 2 kV. They’re trying to come to a more realistic level,” said de Jong. Back when we were working with 0.5 micron processes, breakdown voltages were higher. “With no ESD, 0.5 micron technology fails at 20 V,” said Keppens. “Today, it’s more like 4 V.” While that makes transistors more vulnerable, it also makes it harder to provide protection as high as what has been the standard.
The “ESD protection window” refers to the gap between the highest operating voltage on a pin and the breakdown voltage. That’s the range that a protection circuit must cover. “The ESD protection window is getting very small” as the breakdown voltages drop, said de Jong.
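As a small illustration of how the window shrinks, the sketch below computes it as breakdown voltage minus the highest operating voltage. The 20 V and 4 V breakdown figures are the ones Keppens quotes above. The 5 V and 1.8 V operating voltages are hypothetical values chosen only for the example, as is the function name.

```python
def esd_protection_window(max_operating_v: float, breakdown_v: float) -> float:
    """Return the width (in volts) of the ESD protection window.

    The protection circuit must stay off up to max_operating_v and
    must clamp before the pin reaches breakdown_v.
    """
    if breakdown_v <= max_operating_v:
        raise ValueError("breakdown voltage must exceed the operating voltage")
    return breakdown_v - max_operating_v


if __name__ == "__main__":
    # Breakdown values from the article; operating voltages are hypothetical.
    older = esd_protection_window(max_operating_v=5.0, breakdown_v=20.0)
    newer = esd_protection_window(max_operating_v=1.8, breakdown_v=4.0)
    print(f"older node window: {older:.1f} V")
    print(f"advanced node window: {newer:.1f} V")
```

With these example numbers the window narrows from double digits to a couple of volts, leaving far less margin for the clamp's trigger and holding voltages.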
Because of the lower breakdown voltages, it no longer may be possible to use a single transistor or other component as protection. “It takes more devices to protect against ESD at advanced nodes,” said Mentor’s Hogan. While a chip might have been able to withstand the high voltages on an older process, it can’t on a newer one. As a result, instead of one device, a stack of multiple devices is required. While the overall voltage protection remains the same, that voltage is split up over the multiple devices to keep any of them from breaking down.
Fig. 1: It’s becoming necessary to use multiple ESD protection devices on advanced nodes so that the voltage seen across any single device remains below its breakdown. If the device were linear, then two identical devices would split the voltage, as shown. In practice, these tend to be active devices, so the split may not be in the middle. Source: Bryon Moyer/Semiconductor Engineering
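The stacking argument can be sketched numerically. Assuming the idealized even split that the caption describes for identical linear devices, the first function estimates how many devices are needed to keep each one at or below its limit. The second checks an uneven split, since real active devices may not divide the voltage down the middle. All names and voltages here are hypothetical.

```python
import math
from typing import Sequence


def min_stack_height(total_v: float, per_device_limit_v: float) -> int:
    """Smallest number of identical devices that keeps each at or below
    per_device_limit_v, assuming total_v splits evenly across the stack."""
    if total_v <= 0 or per_device_limit_v <= 0:
        raise ValueError("voltages must be positive")
    return max(1, math.ceil(total_v / per_device_limit_v))


def stack_is_safe(total_v: float, shares: Sequence[float],
                  per_device_limit_v: float) -> bool:
    """Check an uneven split: shares[i] is the fraction of total_v
    that device i sees. Fractions must sum to 1."""
    if not shares or abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must be non-empty and sum to 1")
    return all(total_v * s <= per_device_limit_v for s in shares)


if __name__ == "__main__":
    # Hypothetical: 6 V across the stack, 4 V limit per device.
    print(min_stack_height(6.0, 4.0))
    print(stack_is_safe(6.0, [0.5, 0.5], 4.0))
    print(stack_is_safe(6.0, [0.8, 0.2], 4.0))
```

The uneven case shows why the even-split count is only a lower bound: if one device takes most of the voltage, a stack that looks tall enough on paper can still break down.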
Each of these devices must be able to handle the high currents that an ESD event generates, so each must be large. As a result, the overall circuit consumes a lot of costly silicon.
An ESD event doesn’t know or care what process a device is built on, so it’s hard to reduce voltage specs only for advanced nodes. If a lower voltage is good enough at an advanced node, it also should be good enough on an older node.
Even though the push to lower levels is data-driven, there are well-established companies that already meet higher levels than the new ones being proposed. For competitive reasons, they are loath to reduce their numbers even if the standard changes. “The higher levels are being driven by some customers as a competitive thing,” noted Keppens.
That can put other companies, which might lose a socket for this reason, in a tough position. “If you’re an established company with protection at 4 kV, and a paper says you need only 1 kV, the person selecting the device may still favor the 4-kV part just because it’s a higher number,” said Hogan. So the new, lower standards are taking a long time to catch hold.
Each of the chip-level standards – JEDEC and AEC, for both models – provides testing methodologies and a means of classifying the chips, with a system of ratings for each spec. These are summarized in Table 1. The standards don’t require anything of chips. They merely offer ratings. It’s up to customers to determine what they will require.
Table 1: Ratings levels for ESD robustness according to JEDEC and the AEC, for HBM and CDM. Sources: JS-001, JS-002, AEC-Q100-002, AEC-Q100-011
CDM testing appears relatively fraught, as the industry is finding it difficult to perform the tests with repeatable results. There are also numerous variables that can affect the testing, such as the size of the package. There is ongoing work to improve the situation.
There’s one other category of pin that’s also getting special treatment: analog and RF pins. The capacitive ESD protection circuits can impact the behavior of the signals, making it hard to give them full protection. “RF is the ultimate difficult thing,” said de Jong. “One can’t afford any extra capacitance. There’s not much you can do. This is inevitable. Customers understand that.”
Advanced packaging’s impact
This brings us to the chip/system gray area — advanced packaging. Only the largest companies have been building these devices at this early stage, and it hasn’t always gone smoothly. “Some folks have had issues with ESD and latch-up,” said Ansys’ Srinavasan.
The idea with chip-level protection is to make sure the chip survives the system-manufacturing process. The idea behind system-level ESD is to protect the unit for its operational life. For chip-level, all pins are vulnerable. But for system-level, only signals that exit the system are vulnerable.
That sets up a system-level notion where exposed signals need high protection, but internal signals don’t. It can be applied to advanced packaging, where chip-to-chip connections within the package may need lower protection than signals that will exit the package. “With advanced packaging, chip-to-chip signals can have reduced ESD,” said Keppens.
But from the outside the external signals look just like any other chip signal, and they need to survive manufacturing when the package is mounted on a board. So both chip-level and system-level ESD notions apply here.
Still, it’s not quite that simple. Monolithic chips have one post-silicon assembly step, which is when they go onto a board. Chips in advanced packages have two steps — the step where the chip is assembled with other chips into the advanced package, and then the step where the complete advanced package is assembled onto a board.
That second step means that external-facing signals must have standard chip-level protection (or higher). But how can internal signals have lower ESD protection if they still need to survive the package assembly process? Chen noted that advanced packaging occurs within large foundries or packaging houses that have extremely well-controlled environments, which is more than what a board-assembly house might have. So die signals that don’t leave the package can reasonably have a lower ESD level than those that will leave the package.
That doesn’t mean, however, that the internal signals need no protection. While they might not need a full-on solution, Srinavasan noted that they still need secondary CDM protection.
Chen noted one other practical limitation to specifying ESD on internal signals, many of which connect to an interposer through micro-bumps. At this point it’s not possible to test micro-bump signals because they’re too small and too close to each other to isolate an ESD event to a single signal.
Protection circuits are changing
ESD protection circuits have remained consistent for years, but the tried-and-true approaches are no longer as reliable. “Designing ESD circuits is getting more complex due to things like design rules and higher-resistance metal contacts,” said Keppens.
Traditional circuits typically relied on the well-understood snap-back behavior of grounded-gate NMOS (ggNMOS) transistors. But wafer-to-wafer variation is now high enough that some say it’s no longer a reliable mechanism for finFET or silicon-on-insulator (SOI) processes. In addition, the failure current is lower and leakage is higher.
According to Keppens, there are two main camps regarding the best way to move forward. One promotes the use of stacked diodes built to handle a higher gate voltage and lots of current. These diodes go from pin to either rail, and they are supported by an active rail-to-rail clamp, giving this approach the name “rail-based.”
The other approach is to use some type of snap-back device like a silicon-controlled rectifier (SCR, also known as a “thyristor”). This approach is called “pad-based.” Keppens said that SCRs can handle the same high currents with a smaller junction area, which results in less leakage. But de Jong said that SCRs are no longer available in the most advanced nodes, and while ggNMOS transistors are still available, they have to be very large to work well. But they are sometimes needed: “Then you can have a ‘fail-safe’ I/O,” de Jong noted, referring to a requirement that some system-level specs impose.
The pad-based approach still has a rail-to-rail clamp, but it doesn’t contribute to the pad protection in the way it does with the rail-based approach.
Fig. 2: Two simplified approaches to ESD protection circuits. The left shows a “pad-based” approach using some type of snap-back device like an SCR or ggNMOS transistor. The right shows a “rail-based” approach using stacked diodes. Source: Bryon Moyer/Semiconductor Engineering
While these circuits can protect either ESD model, there will typically be one circuit sized for HBM and another for CDM. Series resistors may be added in either case to help divide the overall voltage if breakdown is a concern.
Yet another change involves moving protection circuits into the core of the circuit, rather than strictly on the periphery. This may be needed to protect some delicate internal circuitry. “Some protection circuits are going into the die for super-sensitive devices,” noted Jerry Zhao, product management director for multi-physics system analysis at Cadence.
Testing ESD pre-silicon
It is possible to verify ESD protection circuits before silicon is built. The goal of the verification tools is to provide automation for the testing conditions, but that’s a lot easier to do for HBM than it is for CDM. CDM is affected by its environment, so chip and package substrate information are needed for a reasonable simulation. In addition, the peak current happens very quickly, making full transient analysis necessary for true simulation.
Srinavasan said that instead of trying to automate a difficult simulation, a rules check is performed using both physical and electrical rules as a proxy for correct CDM protection. The approach doesn’t apply to the full chip, but when there are special pins (like RF pins) where ESD circuits might affect signal behavior, this analysis can help to find the best balance on a single-pin basis.
The rules check doesn’t give a quantitative answer. However, it points the designer to areas that might need more focus. “It requires a lot of designer and ESD expertise,” said Srinavasan.
Zhao noted a similar approach. “Analysis tools can dump a bunch of details on violations. They can also be seen on the layout,” so that the designer knows specifically where improvements might be needed.
There’s also a trend towards multi-die analysis, which is particularly important for advanced packaging. But it’s early days for that approach. “It’s very much ad hoc right now. There’s no standardization,” said Srinavasan.
Dynamic analysis is also being tried: “Most people do static analysis,” said Zhao. “Some are doing dynamic as well.”
ESD used to be the domain of specialists in departments devoted to that specialty. “People have been looking at [ESD] as someone else’s problem,” said Srinavasan.
But that’s changing. “More and more design teams have ESD experts,” said Zhao.
Hogan agreed. “People are getting more serious about ESD and latch-up. It’s no longer one expert in the company. It’s becoming a shared responsibility.”