How to improve the reliability of systems and identify where problems will likely show up.
I have never been very good at introducing topics. During my presentations, I either jump directly into the subject matter, or start with a joke. My mom always said I could be a stand-up comedian. I prefer to sit. Oddly, that led me to becoming an engineer. And while this introduction does not lead me any closer to my actual topic, I presume some people rolled their eyes back and crossed their arms. Excellent. Please proceed reading.
But our topic today is a serious one: IGBT power cycling and lifetime testing. While this topic may not help you decide which smartphone OS will make your life easier, IGBTs have been shaping our technological landscape for quite some time. Cars, trains, planes, boats, turbines, inverters, power supplies, and even your washing machine have been receiving some of the “IGBT-love.” Take my washing machine as an example: it notoriously breaks when my laundry basket is full, or while all the clothes are still soaking in cold water. That is why I want to emphasize the topic of reliability.
One of the typical applications for IGBTs is a three-phase inverter circuit. These inverter circuits can vary quite significantly because motors, for example, can revolve at different speeds. A wind turbine turns at 12 to 24 RPM, while motors in an automobile can reach several thousand RPM. This means IGBTs have to withstand loads that range from continuous to short-pulsed. For electronics this can be quite a strenuous activity, like trying to go for a run after Thanksgiving. Your heart beats fast, you start perspiring, and all you want to do is find a nice and cool spot to relax for a second. Essentially that’s all an IGBT wants, as well. The thermo-mechanical stress applied to the device can be fatal and is associated with many different failure modes, such as die attach failure and bond wire cracks. However, as the word “thermo-mechanical” implies, this can be addressed through intelligent design, material science and being as cool as possible. It’s high school all over again.
An IGBT module can be divided into three main components: the die attach, the die interconnect and the DBC/baseplate. All of these are the focus of studies (examples below), where new ideas continue to emerge that also need to be tested and verified.
Considering that thermal problems relate directly to the lifetime of the device, it would be most efficient to test the thermal properties during the life of a component or assembly, right? Of course this is a rhetorical question, because I already have the answer. I am glad you played along though.
We can do exactly this through the use of structure functions. You may first ask what structure functions are. Structure functions are…deep breath…thermal resistance and capacitance networks that describe the heat-flow path from a heat source outward, and they are obtained by measuring a thermal transient (temperature change over time). If you know exactly what I am talking about, then you may go straight to Go and collect $200. If not, then you should be happy we are not playing Monopoly.
Essentially, structure functions are derived from thermal transient measurements. A thermal transient records the temperature change over time. Simple. A small silicon layer cools off faster than a 300 kg heat sink, so the system temperature changes very quickly at first, and then more slowly as the heat reaches the larger layers. Thus the result looks like an exponential function (Figure 2). Or as I like to call it, the “I-hope-my-device-didn’t-get-too-hot-curve.”
By assuming that each material forms a thermal resistance and capacitance layer inside of a system, an equivalent RC-Network can be formed.
Given that each layer has different properties, the overall time-constant τ behaves differently as heat propagates through the system. This transient is later converted to a structure function and can represent our entire system thermally.
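To make the RC picture a bit more tangible, here is a minimal sketch of a thermal transient built from a few RC terms. It uses the Foster form (a sum of exponentials), which reproduces the shape of the curve but whose individual terms do not map one-to-one onto physical layers. All resistances, time constants, and the 100 W power step are made-up illustration values, not measurements.

```python
import math

# Hypothetical Foster-network terms: (R in K/W, tau in s).
# Illustrative values only, not taken from any real device.
FOSTER_TERMS = [
    (0.02, 1e-3),  # fast term, roughly "small layers near the die"
    (0.05, 5e-2),  # intermediate term
    (0.10, 2.0),   # slow term, roughly "baseplate and heat sink"
]


def zth(t, terms=FOSTER_TERMS):
    """Thermal impedance Zth(t) in K/W for a power step applied at t = 0."""
    return sum(r * (1.0 - math.exp(-t / tau)) for r, tau in terms)


def temperature_rise(t, power_w, terms=FOSTER_TERMS):
    """Temperature rise in K at time t for a constant power step in W."""
    return power_w * zth(t, terms)


if __name__ == "__main__":
    # Fast change at first, then slower as the larger time constants take over.
    for t in (1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0):
        print(f"t = {t:8.4f} s -> dT = {temperature_rise(t, 100.0):6.2f} K")
```

On a logarithmic time axis, each term shows up as its own step in the curve, which is why the short and long time constants can be told apart at all.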
Fig. 4 – Structure function showing the thermal resistance and capacitance from the smallest (left) to the largest (right) layers
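As a rough sketch of what such a curve encodes, the snippet below builds a cumulative structure function from an assumed Cauer ladder, where each layer contributes one resistance and one capacitance, ordered from the heat source outward. The layer names and values are hypothetical, and the helper `degrade` is only an illustration. A real structure function is computed from a measured transient, which this sketch does not attempt.

```python
from itertools import accumulate

# Hypothetical Cauer ladder, ordered from the heat source outward:
# (layer name, R_th in K/W, C_th in J/K). Illustrative values only.
LAYERS = [
    ("die", 0.01, 0.02),
    ("die attach", 0.03, 0.01),
    ("DBC", 0.04, 1.5),
    ("baseplate", 0.02, 40.0),
]


def cumulative_structure_function(layers):
    """Return (name, cumulative R_th, cumulative C_th) for each layer."""
    r_sum = accumulate(r for _, r, _ in layers)
    c_sum = accumulate(c for _, _, c in layers)
    return [(name, r, c) for (name, _, _), r, c in zip(layers, r_sum, c_sum)]


def degrade(layers, name, factor):
    """Return a copy of the ladder with one layer's R_th multiplied by factor."""
    return [(n, r * factor if n == name else r, c) for n, r, c in layers]


if __name__ == "__main__":
    fresh = cumulative_structure_function(LAYERS)
    aged = cumulative_structure_function(degrade(LAYERS, "die attach", 2.0))
    for (name, r0, c0), (_, r1, _) in zip(fresh, aged):
        print(f"{name:10s} C_sum = {c0:7.2f} J/K  R_sum: {r0:.3f} -> {r1:.3f} K/W")
```

In this toy model, doubling the die attach resistance leaves the die's point where it was and shifts every point from the die attach onward to the right. That shift is what tells you where along the heat path something changed.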
The interesting aspect is that we have now gathered information on how the heat flow is actually affected while travelling through the system. This can even be coupled to geometrical information:
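For a single uniform layer with one-dimensional heat flow, the standard textbook relations are a reasonable sketch of that coupling. The symbols are assumptions introduced here: d is the layer thickness, λ the thermal conductivity, c_p the specific heat, ρ the density, and A the cross-sectional area.

```latex
R_{th} = \frac{d}{\lambda \, A}, \qquad C_{th} = c_p \, \rho \, d \, A
```

A larger area A lowers the thermal resistance of a layer and raises its thermal capacitance.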
Now the mathematician might say that we could make the Area “A” infinitely large, and we wouldn’t have this problem. However, the average engineer is usually happy to understand how it is actually working in the application. Also I would hate to see what cars would look like with big copper plates.
Fig. 5 – Concept Drawing as Area approaches infinity
For measurement aficionados, the use of structure functions raises two (or possibly more) questions. How exactly does this benefit us during power cycling? And why do we want structure functions? Surprisingly, the answer is simple. In the dark ages of power cycling, pre-2000, it was enough to know the total thermal resistance. As development has become more intricate, and even the smaller layers have to be optimized, this is simply not enough. Is the die attach failing? The bond wire? Or is a bad thermal interface causing the problem? You may pull your hair out. We do not want that. We like your metaphorical hair. Ultimately we want to have a look into the system and find out why it is failing, before it actually fails. Before you comment and say that you have a solution, or a crystal ball, I want to provide a concrete example.
Vce (the collector-emitter voltage) can rise when the device begins to fail. However, there are two ways the Vce can rise:
1. The bond wires crack, causing a higher electrical resistance and thus raising the voltage. Higher voltage also means higher power. In the slightly altered words of Notorious B.I.G., Mo’ Power, Mo’ Heat.
Fig. 6 – Bond-wire detach during operation
2. The thermal resistance of the heat-flow path increases, causing the device to get hotter. As the device gets hotter, the voltage rises, causing a higher power level. Again, Mo’ Heat, Mo’ Power.
Fig. 7 – Die Attach degeneration over multiple cycles during operation
Essentially, these effects are hard to separate, unless we have a distinct method to do so. I may sound like a broken record, but…structure functions! With structure functions we can determine whether the thermal resistance of the heat path increases. If it does not, then the rising Vce points to a bond-wire issue; if it does, the thermal path is the suspect. Yes, yes, all I need is the total increase in thermal resistance. WRONG! How do I know what to fix? Where will I see the actual changes throughout the life cycle, if I do not combine thermal characterization and power cycling? You may want to remove the device, do some SAM (scanning acoustic microscopy) measurements, or have a different testing station…C’mon. Why go through all the trouble if all the information could be directly at your fingertips?
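The separation logic described above can be sketched in a few lines. The function name and the labels are hypothetical. The thresholds (a 5% rise in Vce and a 20% rise in thermal resistance) are assumptions modeled on commonly quoted end-of-life criteria, so substitute whatever your own test specification requires.

```python
def classify_vce_rise(vce_start, vce_now, rth_start, rth_now,
                      vce_limit=0.05, rth_limit=0.20):
    """Guess the likely cause of degradation from relative changes.

    vce_* are collector-emitter voltages in V, rth_* are thermal
    resistances in K/W. The limits are relative changes and are
    assumptions, not values from any particular standard or device.
    """
    vce_change = (vce_now - vce_start) / vce_start
    rth_change = (rth_now - rth_start) / rth_start

    if vce_change < vce_limit and rth_change < rth_limit:
        return "healthy"
    if rth_change >= rth_limit:
        # The heat path got worse: die attach, solder, or thermal interface.
        return "thermal path"
    # Vce rose while the heat path stayed stable: electrical interconnect.
    return "bond wire"


if __name__ == "__main__":
    print(classify_vce_rise(2.00, 2.12, 0.100, 0.101))  # Vce up, Rth stable
    print(classify_vce_rise(2.00, 2.12, 0.100, 0.130))  # Vce up, Rth up
    print(classify_vce_rise(2.00, 2.01, 0.100, 0.100))  # nothing changed much
```

In practice the thermal-path branch is where the structure function earns its keep: comparing the curve before and after cycling shows which layer's resistance actually grew, rather than only that the total did.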
So what am I actually trying to say? In summary, it is important to look beyond single effects on the system, especially since the thermal resistance has a direct impact on the lifetime. Some may argue that newer materials are robust enough that their degradation can be neglected, but having seen many examples in the industry, I can say they are not the only area that matters thermally. With new developments, different setups and in-situ applications, all of these factors can change from application to application. This change should be a measurable value, and not just a statement of whether the device lasted one hundred or one million hours.
If you would like to know more about what we do, and see how our solution can solve the aforementioned problems, then feel free to check out my fancy picture and video.
Fig. 8 – Fancy Picture (1500A Power Tester)