Advanced Packaging Shifts Design Focus To System Level

Partitioning and floor-planning become big challenges. What goes on which die?


Growing momentum for advanced packaging is shifting design from a die-centric focus toward integrated systems with multiple die, but it’s also straining some EDA tools and methodologies and creating gaps in areas where none existed.

These changes are causing churn in unexpected areas. For some chip companies, this has resulted in a slowdown in hiring of ASIC designers and an uptick in new jobs for board designers and system integrators, because many of these packages are more akin to small PCBs than ASICs. With chiplets/tiles, for example, integration experts can string together tiny surface-mount components, which are then wired onto passive interconnect structures. The result is more of a board-like operation than the traditional ASIC challenge of designing with standard cells, performing static timing analysis, and related activities.

There are a variety of impacts on the design tools needed to make all of this work, as well. EDA companies have been striving to keep them hidden from the designer, but the growing number of packaging options and widespread demand for customization have created the need for new capabilities in the tools.

“Silicon interposer-based design is not dramatically different from what we do today for an SoC-style design,” said John Ferguson, director of product management at Siemens Digital Industries Software. “In an SoC, there are pre-determined blocks and IP and other things that either have been created or are in the process of being created to some spec, and you’re going to route them together. The router doesn’t really need to know all the details. It just needs to know the size of the blocks, where the pins are, and what the pins are supposed to connect to. The tools take it from there. A silicon interposer takes that top-level routing and puts it on another die. It does give an advantage in that today, if you’re doing an SoC, all the blocks have to be manufactured on the same process. Here they can be separate blocks, with a separate process per block. This means if it’s a chiplet-style design, it’s really not that different.”

Theoretically, each chiplet can be manufactured using a different process, and the routing can be planned as if it were one die. That approach works best if the chiplets are isolated in different sections of the package, and in this sense the design flow is not so different. But distance still impacts performance and power, which is why many of the new package technologies are increasingly dense. As a result, much greater attention is required for floor-planning and partitioning.

“If you’re planning a new design targeting a 2.5D or 3D implementation, how do you determine where to partition things into different die?” asked Ferguson. “How do you determine what process is best per die? There aren’t great answers for that today. Everybody is working toward this. It’s a bit easier if everything is just two levels. The more levels of freedom you have, the more variables, and it’s really quite complicated to try to figure out which one is optimal.”
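To make the combinatorics Ferguson describes concrete, here is a toy sketch of exhaustively scoring two-die partitions. All block names, areas, and traffic numbers are invented, and the cost model (cross-die traffic plus area imbalance) is a deliberate oversimplification of what a real partitioner would weigh:

```python
from itertools import combinations

# Hypothetical block areas (mm^2) and pairwise traffic (Gb/s) for a toy design.
areas = {"cpu": 12.0, "gpu": 20.0, "modem": 8.0, "sram": 15.0}
traffic = {("cpu", "sram"): 40.0, ("gpu", "sram"): 60.0,
           ("cpu", "gpu"): 10.0, ("cpu", "modem"): 5.0}

def cost(die_a, die_b):
    """Penalize traffic crossing the die boundary, plus area imbalance."""
    cross = sum(bw for (u, v), bw in traffic.items()
                if (u in die_a) != (v in die_a))
    imbalance = abs(sum(areas[b] for b in die_a) - sum(areas[b] for b in die_b))
    return cross + 0.5 * imbalance

blocks = sorted(areas)
# Enumerate every way to split the blocks across two dies and keep the cheapest.
best = min(((set(c), set(blocks) - set(c))
            for r in range(1, len(blocks))
            for c in combinations(blocks, r)),
           key=lambda p: cost(*p))
print(best)
```

With more dies, more processes per die, and richer physics in the cost function, this brute-force enumeration blows up quickly, which is exactly the "more variables" problem described above.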

Each of those choices has an impact on power analysis, thermal effects, stress, and a growing list of multi-physics challenges.

“On a single die, from a stress perspective, we know how to figure that out because every transistor is going to just be dependent upon what’s in its local neighborhood within the die. But if you have to think about stacking things on top or surrounding it, or the die is being bent because it has more weight on one end than another, you have to know all that information beforehand to figure out if that die is actually going to work,” Ferguson explained. “Then you add in the thermal impacts that will cause additional stresses. How to account for that, along with how things are going to change over time as different components are being run and exercised, is a difficult challenge.”

The impact on design teams can be significant, depending upon the complexity of the package and how customized the design is. “We are talking about the folks who’ve been doing the 2D SoC design, and it’s their world which is getting changed because now it is being disaggregated for cost or yield or scalability or heterogeneous considerations,” said Shekhar Kapoor, senior director of marketing at Synopsys. “This means these designers must now contend with unique issues coming into play. It’s basically a silicon designer’s problem now. Historically, it has been looked at from the point of view of just putting modules together into an MCM type of design, which is still a packaging problem. Now, it is a silicon designer’s problem.”

The first big challenge is to determine what changes from a monolithic die design to a multi-chip/chiplet package. What’s the best way to split it up? Will it utilize a fan-out, system-in-package, 2.5D or 3D-IC architecture?

“Sometimes there are internal decisions, like how to split the modem from the CPU logic, for example,” Kapoor said. “How should that be done? What are the requirements? The performance bottleneck has always been between, say, the CPU and memory. This gives designers an opportunity to bring them together in a package,” he explained. “Based on the intent, architectural exploration is where lots of new effort is coming in. In the SoC world, you had a pretty good idea, and you would start at the floor planning. Now, architecture exploration and die partitioning up front is becoming a critical new piece, and that brings in scalability and complexity problems, and also feasibility problems.”

Other considerations are driving the imperative for system analysis. “How to deal with the increased complexity in terms of now having not just a few thousand bumps, but potentially hundreds of thousands, or millions, of bumps, with TSVs — all of this has to be dealt with and floor-planned around,” he said. “These chiplets could be in different groups. These partitions could be handled by different groups. But they now have to look at the top-level constraints. They have to understand how the bumps and the TSVs are going to be placed, even while they are at the 2D level doing their floor-planning and placement. And from a 3D point of view, this has huge implications. After you’ve decided on your architecture, how are these going to be placed to meet the end goals for performance and power? This is why this critical element of system analysis is coming into place.”

Fig. 1: Analysis-driven design for multiple die. Source: Synopsys


What changes, what doesn’t
Not every part of the design process will see significant change, though. Some pieces will change dramatically, while others will be barely affected.

“I expect exactly no change for the day-to-day digital designer writing RTL for 2.5D or 3D designs,” said Aleksandar Mijatovic, digital design manager at Vtool. “For designs today, we’re typically using Verilog as the hardware description language, which dates from 1995, and that standard is still valid. The industry has passed through many technologies, and the tools have adapted to translate the same old functional code to new technologies, and more or less no differences were seen [by the designer].”

The biggest impact he expects is that some coding guidelines will change. “The heavy lifting will fall to EDA vendors, which need to do full optimization of good old code to brand new technology or production, as they have been doing for all these years. In the end, what a digital designer does is code for the tool and get a netlist. Plus, 90% of designers have never opened a netlist to see what is inside, and may never have seen a wafer or mask. They are simply focusing on functionality.”

To be sure, the companies that have decided on advanced packaging are making it happen. But they are doing those designs with a lot of customization and manual techniques, which indicates more work needs to be done to automate more of the flow.

The basic technologies and algorithms, including place-and-route and analysis for multi-chip packages, exist and can be leveraged today. “You can analyze a chip, you can analyze an interposer, and you can analyze a PCB,” said Marc Swinnen, product marketing director at Ansys. “You can design each of those. However, some of the missing tools are the floor-planning and assembly tools, and that’s a significant difference. Right at the beginning when you are floor-planning a multi-die environment, you need some way to break up the designs. Say Block A goes on this chip, and Block B goes on that chip. You need to be able to see more than one die on your screen.”

Just dealing with a single die, rather than the interactions between different dies, is insufficient. “In today’s 3D IC implementation tools, you can see the multiple dies and you can see how they overlap when you stack them on top of each other,” Swinnen said. “It’s not that you have to go across from one block to the other. You can go straight up from one block to the other, so the distance is actually not that far. Then you have to worry about the mating of the bumps. The bumps on one have to align with the bumps on the other, and there’s checking involved in that. So in the floor-planning stages, there’s a lot of change that needs to happen.”
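The bump-mating check described here can be pictured, at its simplest, as a coordinate comparison between the two dies' bump maps. The bump names, coordinates, and tolerance below are hypothetical; production checkers work on full mask geometry rather than point lists:

```python
# Hypothetical micro-bump maps for two stacked dies: name -> (x, y) in microns.
top_bumps = {"d0": (0.0, 0.0), "d1": (40.0, 0.0), "clk": (0.0, 40.0)}
bot_bumps = {"d0": (0.0, 0.1), "d1": (40.0, 0.0), "clk": (25.0, 40.0)}

def misaligned(top, bot, tolerance=1.0):
    """Return bumps whose mating landing pads are offset beyond tolerance (um)."""
    bad = []
    for name, (xt, yt) in top.items():
        xb, yb = bot[name]
        if ((xt - xb) ** 2 + (yt - yb) ** 2) ** 0.5 > tolerance:
            bad.append(name)
    return bad

print(misaligned(top_bumps, bot_bumps))  # only "clk" exceeds the 1 um tolerance
```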

There’s a push today toward fine-grained distribution, once the design is floor-planned. “Today, most distribution is coarse-grained, whereby the design is broken up at the block edges and you put a block on chip A or B,” he said. “You don’t take a block and distribute the transistors across two chips. Technically, fine-grained distribution would be feasible, and is something the military and others are interested in, but that’s future music right now. Today it’s all done coarse-grained, which means you floor-plan the blocks across the chips, then you can design each chip more or less independently once the floor plan is complete. Then, it’s just a matter of getting this block placed and routed on Chip A, and this block placed and routed on Chip B.”

That’s easier said than done, because there are challenges in routing between blocks, as well, even where there is an interposer. “For quite a while, place-and-route tools have had redistribution layer routers, which are specific analog-type routers. Those routers integrate into that environment with some modifications because of the Z axis that’s added,” said Swinnen. “That’s a bit of a difference. The extractors don’t have a problem with things like RC extraction. The extraction of an interposer exists, and is relatively standard. But as you move up and down the 3D stack, or through the TSVs that go through the chip, that all must be extracted accurately. Once the electrical equivalent is extracted, the timing is pretty standard. The timer doesn’t care; it just has an RC network. It solves the network, and that’s the end of it. When you look at DRC, that, of course, requires a Z-axis-aware extractor to check the geometries on everything.”
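The "RC network" the timer solves can be illustrated with a first-order Elmore delay calculation on a toy extracted chain. The resistor and capacitor values are invented; real extractions produce far larger networks, and signoff timers use more accurate delay models than Elmore:

```python
# Toy Elmore delay on an extracted RC chain: driver -> R1 -> n1 -> R2 -> n2.
resistances = [100.0, 150.0]    # ohms, in path order from the driver
capacitances = [2e-15, 5e-15]   # farads at nodes n1 and n2

def elmore_delay(rs, cs):
    """Sum, over each node, of (total upstream resistance * node capacitance)."""
    delay = 0.0
    upstream = 0.0
    for r, c in zip(rs, cs):
        upstream += r
        delay += upstream * c
    return delay

print(elmore_delay(resistances, capacitances))  # delay estimate in seconds
```

The point of the quote holds in this sketch: once the interposer or TSV parasitics are reduced to resistors and capacitors, the delay calculation is indifferent to where the network came from.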

Swinnen doesn’t believe brand new algorithms necessarily need to be invented for 2.5D or 3D-IC designs. “There are some aspects, like low-frequency power oscillations, which is something a single chip would never see. This is known from the PCB world, so it can be borrowed from that side. 3D IC design is more about creating an environment where the Z axis is a natural part of it. Floor planning is the one that has taken a big hit, and there has to be a lot of change there. Another part is pulling in all these various aspects, like power distribution and power integrity effects not seen before. Thermal effects are probably the limiting factor on 3D designs, so you need to worry about that at the floor-planning stage. Thermal, which was previously an afterthought for chip designs, is suddenly right up front as one of the primary floor-planning considerations.”

So realistically, is 3D-IC the answer to complexity? Some chip designers believe that by going to 3D-IC and multi-chip packaging their lives will get easier. Most do not.

“It absolutely does not get easier,” said John Park, product management director for IC packaging and cross-platform solutions at Cadence. “It gets exponentially more complex. All of the things you have to do when designing an SoC are still there because you’re still designing something, probably at an advanced node, that’s going to be part of a multi-chiplet solution. But now you have to account for all of these system-level challenges that come into play, like power at the system level. As soon as you go to any kind of 3D stack, thermal is probably the very first thing that people need to take a look at. Also, signal integrity is something we’ve done in the board world for the last 30 years to validate the electrical connection between the chips or the board level parts. This now comes into play because the interfaces between chiplets need to be electrically validated for signal integrity.”

Mechanical stress also comes into play in all packages, but with stacked die it adds a whole other set of problems. Multiple devices are sometimes mounted on not-so-rigid carriers, and that means additional tools are needed in the flow.

Fig. 2: Memory on logic flow. Source: Cadence


Does the designer sit down at a single tool and do all this? How many tools does the designer have to work across?

The design engineer still needs all the tools they had in the past for full-chip design methodologies, and now they must plug in system thermal, system signal integrity, and mechanical stress tools, among others. “The bottom line is, as soon as you move from a single chip to multiple chips, you have to have a top-level planning and optimization tool that allows the netlist to be figured out that ties Chiplet A to Chiplet B, even if it’s not a chiplet,” Park said. “If it’s a tile, a netlist is still needed that says the logic on this bottom tile ties to the memory, and to the SRAM above it, and this is the netlist to connect it. If you’re an IC designer, you’ve got your IC methodology, so the top-level planning and optimization tool is the first one that should be added to the flow. Then, a top-level design aggregation platform is needed to pull all of this together, and to come up with the top-level netlist. More and more people are looking beyond just their multiple chips. In some cases the end product has a very restricted form factor, meaning the PCB is going into a smartwatch or something like that. They’re doing the planning of the PCB, the packaging, and the chiplets all in a single pass because it’s all pretty tightly coupled together,” he explained.
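The top-level netlist Park describes, tying logic on a bottom tile to the memory above it, can be modeled minimally as a connectivity table. The die names, port names, and net names below are all invented for illustration:

```python
# A minimal top-level connectivity table for a hypothetical logic-plus-memory
# stack: each net ties a (die, port) pair on one die to a pair on another.
top_netlist = [
    ("mem_bus0", ("logic_die", "sram_if[0]"), ("sram_die", "port[0]")),
    ("mem_bus1", ("logic_die", "sram_if[1]"), ("sram_die", "port[1]")),
    ("clk_fwd",  ("logic_die", "clk_out"),    ("sram_die", "clk_in")),
]

def ports_on(die):
    """List every top-level port a given die must expose to the stack."""
    return sorted(port for _, *ends in top_netlist
                  for d, port in ends if d == die)

print(ports_on("sram_die"))
```

A planning tool works from exactly this kind of table, in both directions: it derives each die's required port list from the top-level nets, and flags nets whose two ends cannot be placed close enough to meet timing.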

This can be daunting to designers, who can barely keep up with the tools they already have. Now they need to add more tools to their flow.

On the power analysis tool side, Ferguson sees them more or less in place. “It’s not dramatically different. We’ve been working on stress and thermal for some time, but until now we haven’t seen people saying this is really as important. That has started to change. There are some companies that have built complicated 2.5D or partially 3D designs that had failures, and ultimately they traced those failures back to thermal and stress problems. There are ways to solve that, but it’s just a matter of how it gets rolled out, as well as how to combine it in the tool flow so that the power drives the thermal, which ultimately impacts the stress, accounting for the fact that stress also impacts power. It is an iterative loop, and how to optimize it is rather tricky.”
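The iterative loop Ferguson describes, power driving thermal, thermal driving stress, and stress feeding back into power, can be sketched as a fixed-point iteration. All coefficients below are made up purely for illustration; real flows couple full power, thermal, and mechanical solvers rather than scalar models:

```python
# Toy fixed-point iteration over the coupled power/thermal/stress loop.
def converge(p_dyn=1.0, ambient=25.0, tol=1e-6, max_iters=100):
    power = p_dyn
    for _ in range(max_iters):
        temp = ambient + 30.0 * power               # crude thermal-resistance model
        stress = 0.01 * (temp - ambient)            # thermally induced stress proxy
        new_power = p_dyn * (1.0 + 0.05 * stress)   # stress-dependent leakage term
        if abs(new_power - power) < tol:            # stop once the loop settles
            return power, temp, stress
        power = new_power
    raise RuntimeError("did not converge")

print(converge())
```

With weak feedback the loop settles in a few iterations; the practical difficulty in real designs is that each leg of the loop is an expensive solver run, so "just iterate until it converges" is exactly the optimization problem described as tricky.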

This is even more tricky because designers today are not accustomed to doing thermal and stress analysis on their designs. “Package designers historically have done thermal analysis, but it’s challenging to take a thermal solution and push it into more of a transistor-level detailed analysis,” Ferguson said. “That’s where the challenges come in. Also, who should do what? Is it the package team? Is it the chip team? Can we bifurcate it? Everybody has a different approach and a different opinion. That will coalesce eventually, but it’s not ready yet.”

Finally, a significant issue Park doesn’t see being discussed is bumpless stacking. “In the world of bumpless stacking, which is mostly memory-on-logic stacking today, people don’t use chiplets. Those aren’t chiplets. Some people are starting to use the word tiles to describe it, where you’re just stacking some additional logic. In this case, you’re stacking memory on logic, and there is no I/O buffer separating them. This means when the integration is done, static timing is needed. You also need to do flop-to-flop timing verification within that stack, which is not something you need to do when you’re working with multiple chiplets, because there the timing is closed out to the wrapper or the pad ring that surrounds that chip or chiplet. Static timing plays a huge role. Going forward, there will continue to be options for 3D packaging technologies, but we will also need new 3D integration design tools and methodologies.”

Given the potentially huge number of variables and new approaches to analysis, planning and optimization of multi-chip designs, tool vendors are working toward unified cockpits that allow for multi-chip and chiplet design planning, implementation, and system analysis. Ultimately, this should make it easier for designers to focus on design instead of the integration of point tools. Some of these approaches are coming to market, but many more are both needed and expected.
