Perfection sometimes stands in the way of progress, and there is evidence this may be happening with chiplets. It may be time to slow down and make real progress.
Experts At The Table: Demand for chiplets is growing, but debate continues about whether standards and general-purpose chiplets will kick-start the commercialization boom, or whether success will come through customization of those chiplets. Semiconductor Engineering sat down to discuss these and other related issues with Elad Alon, CEO of Blue Cheetah; Mark Kuemerle, vice president of technology at Marvell; Kevin Yee, senior director of IP and ecosystem marketing at Samsung; Sailesh Kumar, CEO of Baya Systems; and Tanuja Rao, executive director, business development at Synopsys. What follows are excerpts of that discussion, which was held at the Design Automation Conference. To view part one of this discussion, click here. Part two is here.
SE: There is always an overhead associated with interfaces and standardization. Will the industry accept chiplets that are less than optimized?
Rao: Die-to-die connectivity adds to your power and area overhead. People want to keep it as simple as possible, but the company focus is to make the product work. That is their first priority.
Kumar: If you look at chiplets, there is always a cost. Die-to-die takes power and bandwidth. There are two main drivers. One is cost. You can mix and match, you can have memory in a different process than compute. Second is that it gives you scaling. You can re-use your design. If you look at the AI guys, they are building architectures where you design one chiplet architecture that can scale to 64 chiplets.
Alon: Where the chiplet is reticle-sized.
Kumar: Even if we have one standard that everyone is using for chiplets, it’s not clear to me that the top-dollar companies will use it. They want the extra 10%.
Rao: They have such a large volume that it makes sense.
Kumar: Today, some of those are multi-package. Those are the two main drivers. Talking a little bit more about the disconnect, the protocol standards are there. The way protocol standards will go is that we will always have NVIDIA, Intel, AMD, and the rest of world, which is de facto Arm today. I don’t think there is a disconnect about the protocol. Where is the disconnect?
Yee: You say that. UCIe keeps on saying that. But when you go to the customers, they say the opposite. They say that there’s no protocol…
Rao: If I don’t talk to you at all, and I say, ‘This is my chiplet, this is your chiplet, let’s make it work,’ it won’t work. But if I come to you at the architecture level…
Kuemerle: You have to define them both.
Rao: Exactly.
Kuemerle: There’s no path to the marketplace.
Yee: If that is the case, I don’t even need UCIe because you’re building both sides. I can just connect wires and say it’s going to work. That doesn’t solve our problem in terms of standardization.
Kumar: Let’s say we have a CHI chiplet on this side and another on that side, and they don’t work together. I will bet that most likely it’s because of electrical issues.
Yee: No, I guarantee you that is not the problem.
Kuemerle: What are other reasons?
Yee: To be honest, electrical is the part I worry least about, because today UCIe is well-defined. I’m almost positive that the PHYs are going to talk and be okay.
Kumar: So it’s physical, basically. Is it always physical?
Yee: It is a combination of all.
Kuemerle: Everything together? It’s a huge squirrely mess when you put it all together. The problems are mainly logical, in my opinion.
Alon: It’s the whole picture. It’s a package. It is not a PCB. You can’t just throw anything on it.
Yee: With a PCB, you always know your channels are FR4, or something like it. You can model it, and it is well-defined.
Alon: There is no material system limitation. No chemical limitation. There’s no foundry saying, ‘I can’t possibly take some other foundry’s content and put this thing in.’ There’s none of that.
Rao: Narrowing the standardization space is not the right thing. People are still exploring what works for them, what works for this application, what works for their product.
Kuemerle: We are not advocating for doing anything with the standardization. I agree that we have plenty of standard interfaces. We almost have enough standard link layers. And we have no clue beyond that. But that’s a whole different story. It’s not a matter of saying, ‘Everybody, stop doing standardization. Keep the I/O work going, and everybody is going to need new versions of it, and it’s all wonderful.’ It’s getting to the next level of standardization that we’re missing. I need flexibility to do most of the things I do. But with HBM, it works. With HBM, I know I can put together a product and get all the puzzle pieces to line up logically, physically, and from a test perspective.
Rao: But HBM took like 10 years.
Kuemerle: Right. How do we make it work? What does it look like? What are those products? And what happens when we define something as rigidly as we did with HBM? Will it give people the ability to actually differentiate? Memory suppliers do.
Yee: Even after 10 years it is still tough. The problem we have is that chiplets are accelerating. We can’t wait 10 years. Even now, as a foundry, we’ve shifted in the last year and a half, two years, to where we merged our memory, our foundry, our packaging, our design teams together, because we know the end-to-end flow is so hard. Unless you work together, it doesn’t happen.
Alon: The chiplet community as a whole has suffered from ‘the perfect being the enemy of the good.’ Companies will go and define their own sockets, and they’ll build their own chiplets for those sockets. If they are motivated properly, they may even say, ‘This is my socket.’ And if they’re big enough, and successful enough, people will build to that socket. There’s a set of startups that are going to compete to try and be the first ones out of the gate. And I suspect the first two or three that are successful will say, ‘Here’s the market I want, here’s the chiplet I want. I’m not going to be differentiating in this, but you guys can, because somebody else has better differentiation there.’ That will start to happen. The bigger dominoes certainly will fall when people with larger checkbooks start doing the same thing. But just like with HBM, it wouldn’t have happened had there not been multiple suppliers. The thing that will break that domino is that even though they’ll say, ‘I need to customize — I just defined the whole darn thing,’ they can customize it enough and still have people who actually want to be in that business, who have that focus, and may be very cutthroat on just that one thing. That is probably better than what I’m going to do for something that’s not my core business.
Rao: That is why all the large companies are involved with UCIe and BOW. They are pushing for that. They obviously are doing more than what they are going to put in the spec, but eventually everybody in the industry will catch up.
Kumar: Especially application companies like Microsoft and Google. They are not in the business of selling chips. They most likely will take the path of least resistance and the path that will lead to the biggest ecosystem. As these companies grow more powerful, as they build more chips, they are going to overrule the NVIDIA and Intel dominance.
Rao: It depends on volume.
Kumar: They have huge volumes. If you look at Amazon, Graviton is a completely standard part. Arm cores, CHI protocol, multi-chiplet architecture. Amazon has said that 30% of AWS deployments are now Graviton.
Rao: But they have not opened up, saying, ‘Anybody want to build this chiplet?’
Kumar: But the main point is they are using a standard architecture. They are also using UCIe.
Alon: Imagine if they said, ‘We want a memory chiplet. We want this exact spec. You build it to this spec. Whoever wins, we’ll buy it.’ People will go build that.
SE: Who should be defining the socket? The end customers or those who understand the physical interface?
Alon: It’s a great question. None of these are easy problems. Will people make mistakes along the way? We’ve made plenty already. But we’re fixing them as we go, because we are all learning. At the end of the day, they are already defining those interfaces. They’re doing it collaboratively with experts. But if they’re going to build that product, they’re going to have to define it, whether they like it or not. They may be leveraging quite a bit from BoW or UCIe. The point is certainly not to say that we throw out all the standardization work that was done. We build on that. To take the next step, we have to start thinking about how we pull in more of these pieces and get people to define specifically, ‘We are doing this.’
Yee: With HBM, it is the people using it who have defined it.
Rao: The product people.
Yee: Going from HBM3 to HBM4, it was JEDEC that defined it. And those are the right people. Do standards organizations always make the right decisions? There’s a lot that goes into that. We’ve announced doing custom HBM. We believe that’s the direction to go. You had DRAM DIMMs and you plug them in. Now you can have chiplet HBMs that just connect in a standardized way. You have a die-to-die with BoW, UCIe, or proprietary. Now I know that I can insert high-capacity memory, and my shoreline is usually better than having 2,000 pins for HBM4. Because my shoreline is better, I can put more HBM in the package than I could before. For a lot of hyperscalers, that is the bottleneck. It’s not compute. They have to throttle back their compute because they don’t have enough memory to drive it.
Rao: Whenever memory vendors come up with an HBM chiplet — and let’s say they use the CXL standard — then it will become a de facto standard. Everybody will start following that and say, for all memory chips, let’s build with that spec. Somebody has to say, ‘A chiplet is available. This is the spec. This is the standard.’ And then people will start lining up for memory chiplets.
Alon: If they had chosen ABC instead, where ABC is an imaginary thing rather than CXL, and if it’s a large enough market, people will build it. That’s what’s been defined. We don’t talk about the HBM PHY spec because it’s just part of that product.
Yee: If you provide a solution, most people will take it because they just don’t have a better solution themselves.
Kuemerle: Even if it is sub-optimal.
Yee: Exactly. Let’s get some kind of solution out that people will adopt, even though it might not be the perfect solution for your particular socket. Once it’s out there, people will start saying that if I did this, this, and this, could I be better than the company I’m competing against? That’s when you start getting that customization.
Rao: That is a chicken-and-egg problem. There is a tape-out cost. Who will bear that tape-out cost before committing to volume?
Alon: That is why I keep steering it in the direction that companies basically will end up defining their own sockets for their own purposes, optimized for their own ways. Once that’s successful, that becomes the point when something exists. I’ll just go use that thing. But if we try and force them now to take your thing, which you don’t even know will succeed, and smash it into this thing that I want you to be — that’s why we’re having so much heartburn. ‘I just gave up on X percent of area, X percent of power, or Y percent of cost. And I’m competing against these huge companies. I can’t do that.’
Kuemerle: The first application has to be something that everybody needs, but nobody really cares about.
Yee: An I/O chiplet. A lot of companies will consider an I/O chiplet as a throwaway.
Rao: But do we want 16 or 32 lanes?
Yee: I understand there’s a lot of variation, but that is the kind of IP no one really wants to build, because it doesn’t really add that much value. If someone else could build it for me, I can focus on my compute side or my accelerator side. You need something to connect to. I have talked to a lot of companies that are building just the accelerator, or just the compute, or the CPU. They are looking for a partner to build the I/O side for them.
Alon: The promise of chiplets, in many ways, was that we can have people just focus on their expertise. The secondary follow-on from that is, ‘The thing that’s not my expertise is not the value, and somebody else should be doing that.’ But you say that’s just commodity. Nobody wants to be commodity.
Kumar: We have the three big compute folks, and then we have all these big hyperscalers. If you look at all of them, they all have a compute-out platform architecture, in the sense that they have their own I/O die, they have their own memory die, and sometimes they are combined. They have all moved to an architecture where compute is plug-and-play, where you can plug in CPU dies, you can plug in GPU dies, and so forth. As we get into more and more hyper-optimized architectures, it’s in their interest to open that platform so these small guys don’t have to invest and build a platform. They open the platform, they open the electrical and protocol interfaces, and then small guys can come in and add their secret sauce. Maybe NVIDIA will not go that way. But it’s in the interest of, for example, Amazon to go that way. NVIDIA is charging a fortune for a single socket today. It’s in their interest to open that so people can come in and innovate on it. I wonder why it’s not happening.
Yee: If I am NVIDIA, I’m not going to trust just connecting to anything. Why would I open that up?
Kumar: Their platforms have very sophisticated security architectures.
Yee: If you look at what they’ve already done, if they like a technology, they will offer NVLink to that one customer. They’ve done it with MediaTek. But they are not going to open up the whole platform and say, ‘All you little guys start designing, and maybe you’ve designed something that will integrate with my stuff.’
Kumar: Amazon and Microsoft are literally giving money to NVIDIA right now. It’s in their interest. They have the platform, they have the volumes, they have the compute architecture, so they just need to open it up.
Alon: A not-so-hidden motivation of mine is to get this message out there. If those companies finally stood up and said, ‘I really do want chiplets, I really want to be able to go and buy these things, and here’s money on the table, this is my chiplet,’ that would be different.
Kumar: They are in the right position to enable it. And it’s in their interest.
Yee: What is more practical is that these guys are going to create their own micro-ecosystem and say, ‘I’m interested in what you’re doing. Let’s do some NDAs, and you guys start doing things, and I’ll enable things.’ But I don’t see them opening it up and saying, ‘This is my platform, it is very proprietary, has made me tons of money. Here it is, guys.’
Rao: If they are gaining footprint they might do it.
Alon: Somebody could say, ‘I am only opening up this, and I’m not going to tell you anything about anything else. I just want this very exact chiplet for this one very specific thing.’ That’s entirely viable. These are chicken and egg problems. That’s why we all keep harping on the hyperscalers. If a hyperscaler said, ‘This is my memory chiplet,’ there’s enough dollars behind that statement that there’s going to be a large enough fraction of people who will be willing to take the bet.
Rao: Then they go to one memory vendor and say, ‘Build this for me.’
Alon: Let’s say they go to two memory vendors, because they want multi-vendor. And then they say, ‘Maybe I really need three.’ And then you have the same set of vendors supplying it as supply HBM now. I’ll call that a success.
Rao: That’s happening in some areas.
Yee: AI enabled HBM, because cost was no longer a factor. Ten years ago, when HBM was introduced, everyone wanted to use it. The technology was there, but they just couldn’t afford it. Now the hyperscalers say, ‘I don’t care what the cost is.’ That’s why HBM has exploded. If the cost had been down 10 years ago, it would have proliferated a lot faster.
Rao: We are talking about a die-to-die interface standardization. If everybody starts stacking 3D die with hybrid bond, we don’t need die-to-die IP.
Alon: I violently agree.
Rao: So maybe, if the cost of 3D stacking and packaging comes down, that’s the way the industry will go.
Kuemerle: Then it becomes even harder to come up with a form factor.
Alon: Take all the complaints we had on the physical side and add a whole new dimension to the problem, and that is 3D.
Yee: I look at UCIe and I see the momentum. We have invested in it, as well. But I also can see that this is the fastest standard adopted and the fastest standard completed in history. If I talk to customers today, it came out at 16 Gig, then 32 Gig, and it is being pushed to 36 to 40 Gig. And now they’re talking about 64 Gig. To this date, I haven’t seen one product that even came out at 16 Gig, so 16 Gig was almost dead before it even shipped. Everyone wants 32 or higher. And then when you start talking about 64, it looks like SSR. So why do I need UCIe if it looks like SSR? And if I go beyond 64 Gig, then there are just a lot of proprietary standards out there. I haven’t seen any volume production out there, but the standard already has evolved three generations.
Rao: The standard is moving faster than the product companies.
Alon: Which is kind of scary. That’s exactly why I complain about my own community. Let’s stop pushing on these axes that no one knows you need, and let’s define the product so we can build things.
Yee: Because they don’t know if they need it or not. ‘This is our latest and greatest, so I must need this.’
Read part one and two of the discussion:
Defining The Chiplet Socket
The industry may have started with the wrong approach for enabling a third-party chiplet ecosystem, but who will step in and fix it?
What Comes After HBM For Chiplets
The standard for high-bandwidth memory limits design freedom at many levels, but that is required for interoperability. What freedoms can be taken from other functions to make chiplets possible?