
The Gargantuan 5G Chip Challenge

Millimeter-wave technology depends on some complex interactions and new technologies that can affect long-term reliability.


Blazing fast upload and download speeds for cellular data are coming, but making the technology function as expected throughout its projected lifetime is an enormous challenge that will require substantial changes across the entire chip ecosystem.

While sub-6GHz is an evolutionary step from 4G LTE, the real promise of 5G kicks in with millimeter-wave (mmWave) technology. But these higher-frequency signals attenuate much more rapidly, and they are more susceptible to interruption by various types of noise, physical objects such as walls or people, and even environmental conditions such as heat or rain. The solution is to use many more base stations and small cells, with nearly continuous calibration of signals, and to basically “bend” signals around objects through the use of multiple beams from different angles.

This is a mammoth undertaking. It touches every aspect of the 5G ecosystem, starting from the chip/package/board architecture, extending through software development and testing, manufacturing, packaging, and even out into the field. This challenge has set off a scramble in inspection, metrology, and test, with each of those processes becoming more complicated, more expensive, and increasingly critical.

Among the challenges:

  • Test, inspection, and metrology are taking longer as 5G chips become more heterogeneous and as antenna arrays are embedded into advanced packages. There are more insertion points for these processes, and many of those insertions are requiring more time, which in turn is driving up the cost of these chips.
  • Electromagnetic interference, non-linearities, and various types of noise (thermal, phase, power, etc.) have emerged as first-order problems in mmWave devices. The signals themselves are more susceptible to interference, and those effects are magnified as dielectrics become thinner in these chips, many of which are developed at the most advanced process nodes. Those dielectrics ultimately break down over longer lifetimes and exposure to the elements, particularly in base stations and small cells. Moreover, because these chips are being crammed into smaller spaces, even EMI from PCBs is becoming a problem.
  • The industry is just beginning to study the integrity of the mmWave signals in urban areas, including the impact of leaves on trees, weather, buildings, and other objects. The problem here is that different frequencies behave differently, and those frequencies can vary from one country to the next, or even within the same country. That makes it far more difficult to model and simulate these devices, and what works in one region may not work as well in another.

“There are some significant inflection points,” said Richard Oxland, product manager at Siemens EDA. “One of those is range of the carrier signal, which tends to be an order of magnitude lower, while the density of signals per square kilometer is much higher. That means you need many more distributed base stations. But it also means you will have multiple network operators sharing a base station, and so you need hardware that supports multiple simultaneous networks.”

That makes it more attractive to develop these chips using more advanced manufacturing processes or some sort of multi-die approach with advanced packaging, because both of those approaches provide additional real estate for more functions and better signal buffering. But it also makes it more difficult to identify a defect — particularly a latent defect that may not show up for years — and to pinpoint the source of a problem when it arises.

Changes to test
Key challenges with mmWave involve what to test, when to test it, and what to do with the data once those tests are completed. That’s complicated by the fact that mmWave testing includes both RF and digital circuitry, new materials — including some developed at the most advanced process nodes — and new packaging approaches.

“We’re testing the chip, but we are also testing the module the chip is in,” said Adrian Kwan, senior business development manager at Advantest. “If you put the chip in an SiP (system in package) or AiP (antenna in package) module, then it’s a whole different kind of system-level testing. On the device side, it’s still pretty much what we’ve been doing for a lot of transceiver tests. But with modules, sometimes it’s ‘go/no-go,’ or AiP testing like error vector magnitude [1] versus antenna distance. With the transceivers, the front-end and the antenna are all integrated into a single module, so there is a totally different set of test cases.”
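Error vector magnitude compares each received constellation point against its ideal position. As a rough illustration of the measurement (not Advantest's actual test code, and `evm_percent` is a hypothetical helper name), EVM can be computed as the RMS error power relative to the RMS reference power:

```python
import numpy as np

def evm_percent(measured, ideal):
    """RMS error vector magnitude as a percentage of the
    RMS power of the ideal (reference) constellation."""
    measured, ideal = np.asarray(measured), np.asarray(ideal)
    err = measured - ideal
    return 100.0 * np.sqrt(np.mean(np.abs(err) ** 2) /
                           np.mean(np.abs(ideal) ** 2))

# QPSK reference points and measurements with a fixed offset error
ideal = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j])
measured = ideal + 0.05 * (1 + 1j)
print(f"EVM: {evm_percent(measured, ideal):.2f}%")  # → EVM: 5.00%
```

In an AiP test, this figure would then be characterized against variables such as antenna distance, as described above.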

While there is no shortage of interest in mmWave, the chip industry is just beginning to wrestle with reliability of this technology and quality of service in the real world.

“It’s just the beginning of millimeter wave testing, and right now it’s still pretty low volume,” said Kwan. “But it’s probably going to ramp up pretty quickly this year, and the target is to go to over-the-air (OTA) testing toward the end of this year. A lot of this is similar to what we are doing with 4G LTE testing, but now we’re adding phase shifting and beam-forming. And on the OTA side, we will be looking at radiation patterns, which we need to do for the antennas. For example, we can do EIRP (effective isotropic radiated power), plotting power versus frequency.”
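EIRP itself is simple arithmetic in the log domain: conducted transmit power plus antenna gain, minus any losses between the amplifier and the antenna. A minimal sketch, with made-up 28 GHz band values for illustration (the function name and numbers are assumptions, not Advantest's):

```python
def eirp_dbm(p_conducted_dbm, antenna_gain_dbi, loss_db=0.0):
    """EIRP combines conducted power (dBm) with antenna gain (dBi),
    minus losses between the amplifier and the antenna (dB)."""
    return p_conducted_dbm + antenna_gain_dbi - loss_db

# Sweep across a hypothetical 28 GHz band to plot power vs. frequency
for f_ghz, p_dbm in [(27.5, 10.0), (28.0, 10.2), (28.5, 9.8)]:
    print(f"{f_ghz} GHz: EIRP = {eirp_dbm(p_dbm, 24.0):.1f} dBm")
```

In an OTA setup, the conducted power is not probed directly; instead the radiated power is captured by a measurement antenna and referred back through the link budget.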

Fig. 1: Antenna array module using 12 dual-polarized patch antenna elements and 7 dipole antenna elements. Source: Advantest


OTA testing is a critical element in mmWave, because the spectra allotted to 5G are narrow and the amount of power needed to effectively carry signals from the base station or repeater to the end device and back again can vary.

“There will be at least one additional test insertion that is unique to millimeter wave, which is over-the-air testing, and that is new,” said Jeorge Hurtarte, wireless product marketing strategist at Teradyne. “You’re capturing the radiated signal over the air. So there is zero contact, which is unique. With 4G and 3G, all tests had contact. Now, there will be an antenna that will pick up the signal and then transfer it to the ATE instrument, which can then be tested. That’s the trend, and that trend will be even more predominant when we get to terahertz frequencies with 6G. There will be more antenna elements, the distance for the [test chamber cavity] will be smaller, so OTA will become predominant.”

Fig. 2: mmWave tests for mass production. Source: Teradyne


All of this certainly makes test more complicated, but it also makes it more difficult to determine what to put in hardware and what to put in software, and what to keep in analog versus mixed signal. The tradeoff is that hardware is faster and more power-efficient, but software is more flexible, and analog can be calibrated and adjusted while digital is generally fixed.

“During manufacturing you do a calibration, you store the calibration tables, and you’ve already characterized what it does over different temperatures,” said Peter Claydon, president of Picocom, which makes 5G baseband SoCs. “You store all that. In RF and analog, that includes things like digital pre-distortion. So if you’re looking at the output of the RF, you feed that back in digitally to show how you’re distorting your waveform, as well as things like the non-linearities in your power amplifier. That’s taking care of other things that are happening in the RF chain. But in digital, you’re just going to have something that potentially fails in a nasty sort of way, and that comes down to what you’re doing with your EDA tools. You’re running so many more tests, looking at the maximum currents across every single net of the chip. So there’s a whole lot of things that go on in the design process. It used to be that you could finish the design and the layout and you could tape out a week later. Now you finish the design and tape out six months later, because you have so many back-end tests to run and things to alter.”
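The digital pre-distortion Claydon describes can be sketched with a toy model: characterize the amplifier's compression, then apply an approximate inverse to the waveform before it reaches the amplifier. This is a heavily simplified, assumed third-order model for illustration, not Picocom's implementation:

```python
def pa(x):
    """Toy power-amplifier model with third-order compression.
    Real PAs are measured and characterized, not assumed."""
    return x - 0.1 * x**3

def predistort(x):
    """First-order inverse of the toy PA nonlinearity: pre-expand
    the signal so the PA's compression cancels out."""
    return x + 0.1 * x**3

x = 0.5
print(pa(x))              # 0.4875 — compressed output without DPD
print(pa(predistort(x)))  # ~0.499 — much closer to the intended 0.5
```

In a real transceiver the predistorter coefficients come from the feedback path Claydon mentions (observing the RF output digitally), and the calibration tables are stored per temperature point.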

Fig. 3: 5G small cell SoC components. Source: Picocom


Ongoing testing
It doesn’t end there, either. Typically, these are expensive chips, and in the case of small cells, they’re not always easily accessible. So rather than throwing them away if they start behaving oddly, the goal is to keep track of their performance over their projected lifetime and make modifications whenever and wherever possible.

“It’s not going to stop at distribution, and 5G is a great example of this,” said Danielle Baptiste, vice president and general manager of software at Onto Innovation. “There’s this concept with cell phones of ‘phone home.’ So the chips are going to be reporting back. And if we see some unpredictable result, once it’s out in the field, that means we need to start feeding that back into the manufacturing process. We can learn what’s happening when it’s in production, and that’s really compelling.”

That performance can be affected by a number of factors, including many associated with aging, which can cause drift in analog circuits, electromigration in digital devices, and software incompatibilities that accumulate over time with a series of updates.

“After six months in the field, under frigid or baking conditions, something may have a fault,” said Mike McIntyre, director of software product management at Onto. “How do you trace that fault back to the fact that Metal3, for example, was narrow? Factories today measure their lines for factory control, not for analytics purposes. They’re getting maybe 20 or 40 samples over 100 wafers, but that may be 5,000 parts. And so you’ve got 40 samples matched to 5,000 parts, which is a horrible ratio to try to figure out the measurement for that Metal3 line width that created this failure in the field after six months in freezing weather.”

Fig. 4: Connecting process control and analytics throughout manufacturing. Source: Onto Innovation


One of the consequences of this complexity is that chipmakers are looking to understand what is happening inside a chip/package/board/system at any point in time. Some issues can be identified with built-in self-test, which may kick in whenever a system is booted up. Increasingly, BIST is supplemented with some sort of in-circuit monitoring, which also can be used to alert users to suspicious activity stemming from a security breach.

“You can identify key system-level metrics that are almost like a leading indicator for performance over time,” said Siemens’ Oxland. “For example, you know my average latency on a certain critical connection is increasing over time. If you can build up a picture of what’s normal for an expected operation, then you can determine when you start raising a flag that you need to do a software update. You can collect that data over time, put it into a database, and perform the right kind of analytics on that data.”
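A minimal sketch of the kind of leading-indicator check Oxland describes: learn a baseline for a metric such as latency, then flag samples that drift well beyond it. The class name, window size, and threshold here are illustrative assumptions, not Siemens' implementation:

```python
from collections import deque
from statistics import mean, stdev

class LatencyMonitor:
    """Flags samples that exceed a rolling baseline by more than
    `k` standard deviations (illustrative sketch only)."""
    def __init__(self, baseline_window=100, k=3.0):
        self.samples = deque(maxlen=baseline_window)
        self.k = k

    def observe(self, latency_ms):
        flagged = False
        if len(self.samples) >= 30:  # need enough history for a baseline
            mu = mean(self.samples)
            sigma = max(stdev(self.samples), 1e-9)
            flagged = latency_ms > mu + self.k * sigma
        self.samples.append(latency_ms)
        return flagged  # True = raise a flag, e.g. schedule an update

monitor = LatencyMonitor()
for i in range(60):                   # normal operation: ~5 ms latency
    monitor.observe(5.0 if i % 2 else 5.2)
print(monitor.observe(50.0))          # sudden degradation → True
```

In practice, as Oxland notes, the raw samples would be shipped to a database and analyzed there rather than thresholded on-device, but the principle of baselining "what's normal" is the same.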

This essentially adds new dimensions to testing, and it’s a potentially lucrative new market opportunity for collecting and analyzing data to make sense of what’s happening inside the chip as well as outside of it in the larger system.

“The TAM (total available market) is significant in nearly all the lifecycle stages, from design, test and production,” said Steve Pateras, senior director of marketing and business development at Synopsys. “But by far the biggest opportunity is in the field, because once you get into the field you’re selling to a different audience. It’s a much broader audience than in the early lifecycle stages, with the traditional design, production, engineering, and even system integrators. When you get into the field, it’s pretty much anybody.”

This is especially important for 5G baseband chips, which may serve multiple customers at any point in time under very different conditions. “The SoC can be running software for several operators at the same time,” said Picocom’s Claydon. “It can be in different frequency bands, and you’ve got different customers in different situations. It’s not like one piece of silicon with one software load, where it’s always the same thing. Everyone’s using it in different ways, so it’s important to be able to monitor what’s going on and to have the ability to debug it in the field.”

Inspection and metrology challenges
One of the biggest changes with 5G is in the packaging. In addition to the antennas that are embedded around the package, there are multiple die, which can increase mechanical stress, magnify process variation, and cause age-related issues because not all the die age at the same rate.

“In the past, when you had a single-die package, if the data was good, the package was going to be good,” said Oreste Donzella, executive vice president for KLA’s Electronics, Packaging and Components Group. “Now, when you map all of these dies into a package — a heterogeneous integrated package with 36 dies on top of each other — if one of these dies has a reliability concern, then the entire package is going to fail. This has huge economic, safety, and reliability implications. When you have a single die with 99% yield, everything is okay. But when you have 36 dies at 99% yield, you have a multiplication factor.”
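The multiplication factor Donzella mentions is compounded yield: if any one bad die kills the package, the per-die yields multiply. A quick sketch:

```python
def stacked_yield(per_die_yield, num_dies):
    """Package-level yield when a single bad die fails the
    whole package: per-die yields simply multiply."""
    return per_die_yield ** num_dies

print(f"{stacked_yield(0.99, 1):.1%}")   # single die: 99.0%
print(f"{stacked_yield(0.99, 36):.1%}")  # 36 dies: ~69.6%
```

At 99% per die, a 36-die package yields only about 70% — which is why per-die screening becomes far more economically important in heterogeneous integration.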

Because of the higher frequency, there also are more components on the RF side. “We see a stunning growth in filters, power amplifiers, and also in the complexity of these filters and power amplifiers, because now you’re going to set up your filter based on many different frequency bands, and the frequency bands are also operating over a much larger range,” Donzella said. “Everything that has to do with the transmission of the data is becoming more complicated and requires more sophisticated RF devices. And while the RF filters are not 5nm technology, they are becoming more complex. People are using more gallium nitride, gallium arsenide, and other compound semiconductors to build the filters and power amplifiers. These materials are a source of concern because of the maturity of these substrates versus silicon, and because of the complexity of the processes for compound semiconductors.”

The rising cost of these devices also means it makes economic sense to do much more inspection to make sure the individual components, as well as the SoC, package, and board, have no serious defects. This makes tools such as atomic force microscopy more viable, both from an equipment standpoint and in terms of the additional time it takes to perform deeper inspection.

“If you think about traditional AFM, you have a certain region of interest that usually goes up to about 100 microns square, and that’s it,” said Ingo Schmitz, technical marketer at Bruker Nano Surfaces. “But there is really not yet a mindset that you need to look at larger areas, independent of the differences between materials. If you go to your dielectric, with metal in between, optical techniques suffer from material contrast. So a material difference is perceived as a height difference. AFM doesn’t have that problem, and we are combining an AFM with this large-area scanning. This is going to become more and more important as we go to 3D packaging, whether that’s hybrid bonding or 3D-IC or 2.5D. There’s going to be huge demand for this, because optically you always have that material sensitivity.”

That rising cost also makes it viable to increase coverage in optical inspection. “The front end never did 100% inspection, because it’s physically impossible to look at every transistor,” said Subodh Kulkarni, CEO of CyberOptics. “At the back end, we did do 100% inspection, but we were at a totally different lens scale and speed. Advanced packaging was somewhere in the middle. Now, the cost has gone up compared to a classic PCB, and the need for inspection has increased. So they want 100% inspection of parts that are extremely small and complex because yields are not that good. They are not getting the classic front-end scaling effect because the volumes are not that high.”

Doing more with data
One of the big changes is that across the flow — from design through manufacturing, packaging, and into the field — there is a recognition that data is increasingly critical to finding defects, determining whether latent defects will turn into real defects, and monitoring performance degradation and aging of hardware and software. With mmWave, that also includes the ability to “bend” signals around objects to maintain connectivity between devices.

All of this requires more testing, simulation, inspection, metrology, as well as more data analysis and AI/ML to be able to interpret it. “We just worked with Keysight to model an entire city to see where buildings interfere with waves and how far they will go,” said Rich Goldman, director of photonics at Ansys. “We’re also working with NIST to model the interference of trees.” [2]

Fig. 5: 5G signal coverage on the road. Source: Ansys


More studies are underway across the supply chain, as well. There is little doubt that millimeter wave will move into the mainstream over the next several years, but how it will behave in the field over time is not yet well understood. Still, tools and methodologies are either in place or under development, and the entire 5G ecosystem is racing to build up its base of knowledge in order to make this transition as seamless as possible. Now the question is how and when all the pieces will fit together, and so far that remains a bit fuzzy.

[1] How Error Vector Magnitude Measurement Improves Your System-Level Performance
[2] NIST Helps Next-Generation Cell Technology See Past the Greenery

Related Articles
The Search For 5G MmWave Filters
New options abound, but so far there’s no clear winner.
5G Chips Add Test Challenges
Commercial solutions will take time to ramp, but progress is being made.
Security Risks Grow With 5G
Explosion of data in motion raises serious challenges for chipmakers.
