The industry may have started with the wrong approach for enabling a third-party chiplet ecosystem, but who will step in and fix it?
Experts At The Table: The semiconductor industry has been buzzing with the possibilities surrounding chiplets, but so far this packaging technology has been confined to large semiconductor companies that are vertically integrated. The industry has been attempting to open this up to a broader group of people. To work out what this means for chiplets, and what standardization will be required, Semiconductor Engineering sat down with Elad Alon, CEO of Blue Cheetah; Mark Kuemerle, vice president of technology at Marvell; Kevin Yee, senior director of IP and ecosystem marketing at Samsung; Sailesh Kumar, CEO of Baya Systems; and Tanuja Rao, executive director, business development at Synopsys. What follows are excerpts of that discussion, which was held at the Design Automation Conference.
L-R: Samsung’s Yee, Blue Cheetah’s Alon, Baya Systems’ Kumar, Marvell’s Kuemerle, Synopsys’ Rao. Source: Brian Bailey, Semiconductor Engineering
SE: While the industry has been defining interconnect standards, it’s been suggested this approach was flawed and, instead, the industry should have been defining sockets. What does this mean?
Alon: A lot of the early chiplet activity, particularly on the standardization side, was focused on figuring out how to build the die-to-die interconnect in a way that would be standardized, and hopefully interoperable. There’s been a tremendous amount of progress there, in the OCP ODSA, Bunch of Wires (BoW), UCIe, and a number of others that are proprietary. It is not to say all that work was not worthwhile. But to an extent, it’s created this perception that the interconnect problem is solved, and so therefore we must be able to go and buy interoperable chiplets. The reality is likely very much the reverse, in that we needed to define the specific sets of applications that we want to go and tackle. Each of them should have a large enough market, in dollars, that it is viable to develop chiplets for them and define the sockets. High-bandwidth memory (HBM) is an example. Functionally, mechanically, and electrically, what are these chiplets that will fit into these systems? Once we’ve done that, the interconnects for them will become, not an afterthought, but they will be well-specified, well-defined. We have this perception about what has been created, and the industry is still working its way through this conundrum of, ‘What do I even build in the first place?’ I can’t build an interoperable chiplet just because interconnect is there. The reality is quite different. We need to define the sockets. That’s not going to be an easy thing to do. Once we’ve defined the sockets, we can solve the interconnect problems for those specific sockets in a way that’s concrete and meaningful. More importantly, people can then build chiplets that have a chance of really being interoperable in those sockets.
Rao: What do you mean when you say socket? You mentioned HBM.
Alon: Socket may be a little bit of abuse of terminology. It’s not literally a socket. You are not going to push it in and pull it out. But it has the right connotation in the sense that it is a concretely defined thing, from a mechanical footprint perspective, from a bump perspective, from a functionality perspective, from a test and features perspective. To that extent, HBM is a memory part. The functionality of the memory part tends to be more constrained than a logic piece. We’re going to have to do this for logic chiplets for this to make sense. It’s going to be much more complex. But HBM is really the best analogy that exists today.
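One way to picture the socket Alon describes is as a record that pins down footprint, bumps, functionality, and test features together, so a candidate chiplet can be checked against all of them at once. The sketch below is only an illustration under that assumption: the class names, field names, and every number in it are hypothetical and do not come from the discussion or from any published standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocketSpec:
    """Hypothetical socket definition; all fields are illustrative assumptions."""
    name: str
    footprint_mm: tuple[float, float]  # mechanical outline (width, height)
    bump_pitch_um: float               # bump pitch required by the package
    function: str                      # what the part is supposed to do
    test_features: frozenset[str]      # test hooks the socket owner requires


@dataclass(frozen=True)
class Chiplet:
    """Hypothetical vendor part described with the same attributes."""
    vendor: str
    footprint_mm: tuple[float, float]
    bump_pitch_um: float
    function: str
    test_features: frozenset[str]


def mismatches(socket: SocketSpec, chiplet: Chiplet) -> list[str]:
    """Return every way the chiplet fails to fit the socket (empty list = fits)."""
    problems = []
    if chiplet.footprint_mm != socket.footprint_mm:
        problems.append("footprint")
    if chiplet.bump_pitch_um != socket.bump_pitch_um:
        problems.append("bump pitch")
    if chiplet.function != socket.function:
        problems.append("function")
    missing = socket.test_features - chiplet.test_features
    if missing:
        problems.append("test features: " + ", ".join(sorted(missing)))
    return problems


# Made-up example values, purely for illustration.
socket = SocketSpec("example-memory-socket", (11.0, 10.0), 55.0, "memory",
                    frozenset({"bist", "loopback"}))
part_a = Chiplet("vendor-a", (11.0, 10.0), 55.0, "memory",
                 frozenset({"bist", "loopback", "scan"}))
part_b = Chiplet("vendor-b", (11.0, 10.0), 45.0, "memory",
                 frozenset({"bist"}))

print(mismatches(socket, part_a))
print(mismatches(socket, part_b))
```

The point of the sketch is that interoperability is a conjunction: a part that matches the interface but misses the bump pitch or a required test feature still does not fit.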
Kuemerle: Because there’s a defined standard, you can interoperate between multiple suppliers. You’re absolutely right that the industry did address it completely backwards. As we’re learning our way through this, we’re finding that every chiplet we can think of has a different link layer requirement, different on-chip bus requirement. We have defined these standard interfaces that only get us a small portion of the way to defining something that we could interoperate with. Can you envision standardizing the footprint for a processor? How do we standardize a footprint for an FPGA? Even for memory, are there different formats of it that we need to think about standardizing as well?
Rao: Will that not complicate stuff? By having something like UCIe or BoW, we’re trying to standardize a die-to-die interface, versus a memory interface, versus an FPGA interface. If we define one for processor and one for memory, then there will be so many standards.
Kuemerle: The challenge is that we have standards defined for the interface itself, for the PHY, or maybe for the PHY and one link layer, of perhaps a dozen that you might want to really implement on a given project. We’ve got the physical interface, and the industry said we get a big checkmark. BoW is defined enough, you can implement it. UCIe is defined. You can implement it. But the real challenge isn’t there. It’s in the link layer. Do you have a 2k-wide AXI bus, a 1k-wide, a 512-wide, a 256-wide? You can use CXL as it is defined for UCIe. If we change our mindset and think about this on an application basis, we might be able to say, ‘This class of thing wants this kind of interface, this kind of link layer, and this kind of on-chip bus.’ And we might have a chance of doing it, until we try to figure out how to make DFT work.
Rao: That is what the UCIe committee is now defining. In the new spec, which may be released by the time this article is published, we’ll have the DFT spec.
Kuemerle: For the interface.
Yee: But exactly to your point, you asked the question, ‘Is this a memory interface or FPGA interface?’ It’s not about the interface. When we talk about sockets, think about it as the application. How am I using it, not what the interface looks like. And that will define what that link layer needs to be like, or the data adapter layer, or anything else.
Alon: Once you fly this all the way out, it does even touch down on the PHY, because if it’s on an advanced package, how do these pieces fit together? Even under a given specification, where you’d want to implement the PHYs, it’s likely going to be quite different. If you’re in a world with a super dense package and I can run it under 8Gb/s, I am going to build that one way. And that’s going to be very different than if I am in a standard package, and I need to run 32Gb/s, 64Gb/s, whatever it is, just to get enough bandwidth.
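The trade-off behind those numbers is simple arithmetic: for a fixed aggregate bandwidth, a slower per-lane rate needs proportionally more lanes, which is only practical where the package offers enough wiring density. A minimal sketch follows, assuming a hypothetical 1,024 Gb/s target and ignoring protocol overhead, clocking, and sideband signals. Only the 8, 32, and 64 Gb/s per-lane rates come from the discussion.

```python
import math


def lanes_needed(target_gbps: float, per_lane_gbps: float) -> int:
    """Minimum number of data lanes to reach target_gbps, ignoring overhead."""
    if per_lane_gbps <= 0:
        raise ValueError("per-lane rate must be positive")
    return math.ceil(target_gbps / per_lane_gbps)


TARGET_GBPS = 1024  # hypothetical requirement, not a figure from the discussion

for rate in (8, 32, 64):
    print(f"{rate} Gb/s per lane -> {lanes_needed(TARGET_GBPS, rate)} lanes")
```

Under these assumptions the 8 Gb/s case needs eight times as many lanes as the 64 Gb/s case, which is why the same specification can lead to very different PHY implementations.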
Kumar: If you look at the system architecture, there’s an application level, there’s a protocol level, and then we have link and PHY. What you’re saying is that we should define the application first and the system architecture. And then, based on that, we need to decide the speeds and feeds needed, what kind of latency and bandwidth for data movement, and based on that we can have different chip-to-chip chiplet architectures. If I look at the evolution of the industry, the die-to-die interface has taken off, and there are a lot of die-to-die standards. But at the same time, the protocol evolution is happening. Today, if you look at Intel, AMD and Nvidia, they have standard compute protocols, such as Nvidia’s NVLink, which is an umbrella protocol. AMD has the Infinity fabric, which is also an umbrella protocol: non-coherent or coherent I/O. And Intel has its own protocol jungle. They have IDI, UPI, CXL, etc. That evolution has been happening for quite some time. And that’s how the IPs are working, in the sense that IPs work at the protocol level. When a CPU talks with a particular protocol, how that data is moved across the die or the wires is a physical layer implementation issue. The industry is already moving in a way where the protocol evolution, compute evolution, is happening. And in parallel, the chiplet revolution is happening. At some point they have to come together.
Rao: Now it has everything.
Kumar: In reality, no IP vendor supports CXL today. What they did is they went back and said we are going to support CHI, we are going to support AXI, we are going to support Ethernet. They are adding the protocols on top of that. As long as you have a clean architecture where you have a clean network layer, a clean application layer, a clean protocol layer, a clean PHY layer, and everything is decoupled with the right set of APIs, then the industry can keep moving.
Yee: That’s where I disagree with your statement. You bring up examples saying AMD is doing this, Intel is doing this. That is in a vacuum. It’s all internally vertically integrated. They design that with this very specific application in mind. And they built it around that application. And it’s closed to them and no one else.
Kumar: Are you referring to the protocol or the PHY?
Yee: Both. That’s exactly the point. They said, ‘We’re going to build this, this is what I need.’ They designed NVLink, they modified NVLink. They modified Infinity or AIB, or whatever, for their application, and they customized their PHY to address the needs.
Rao: But it takes many years to get to a standard where everybody uses the same protocols, the same PHY.
Yee: But that is a problem associated with what we have today. It’s not that we didn’t think about the application first. We saw the industry and we said Nvidia is already doing it, Intel is doing it, AMD is doing it. If I am a small guy and we want to catch up, we are force fed, and we need a standard to do that.
Kumar: The good news is that it has historically always happened. There will be big companies that will do proprietary solutions. And there’s going to be open companies. AMD, Nvidia, Intel, they all have their own proprietary architectures, both physical as well as protocol. But Arm has open standard protocols. If you look at the industry…
Alon: Yes, but if I say this is an AXI bus, or CHI, as you know very well, that’s not a unique definition. That is nowhere near specified enough to get two things to talk to each other.
Kuemerle: How do you make something that is generally able to adapt whatever somebody might have for an AXI configuration on one side, to whatever somebody might require on the other side? It’s not a simple problem.
Rao: We have people doing AXI on both sides, but there is some custom logic to it. It is not plug-and-play.
Alon: This is the point. If you want to get to plug-and-play, this means we will have to have a set of chiplet sockets. Even a processor we define for, let’s say, the automotive socket, may not be the same as the processor defined on the laptop socket or the data center socket. The interconnect for all three of those may also be different.
Kumar: That’s exactly the problem we are solving. When you build these ecosystems, AXI has so many variations. CHI has less, which is good. But the AXI data bus can be 8-bit all the way through a thousand bits. That’s where some configurable platform comes into play. You guys are building the most efficient PHY, but what you transport over those matters, because you have to optimize the link layer to make sure it gets transported efficiently. And that is why we are building a configurable solution, so that all these chiplet architectures can interplay. You have to configure the die-to-die interface based on the traffic, based on the interface properties, and then you can connect those together. As PHYs are evolving, the network has to evolve.
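The mismatch problem raised here can be illustrated by comparing two AXI-style interface configurations and listing what a bridge between them would have to handle. This is a deliberately simplified sketch, not Baya Systems' or any vendor's method: the `AxiConfig` fields are a hypothetical reduction of a real AXI interface (which has separate per-channel signals), and the example widths are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxiConfig:
    """Simplified, hypothetical view of one side of an AXI-style interface."""
    data_width_bits: int
    user_width_bits: int  # sideband "user" bits carried with each transfer
    id_width_bits: int    # transaction ID bits


def adaptation_report(initiator: AxiConfig, target: AxiConfig) -> dict:
    """Summarize what a bridge between the two sides would need to handle."""
    wide = max(initiator.data_width_bits, target.data_width_bits)
    narrow = min(initiator.data_width_bits, target.data_width_bits)
    if narrow <= 0 or wide % narrow != 0:
        raise ValueError("data widths must be positive and evenly divisible")
    return {
        # Identical configurations are the only case that needs no glue logic.
        "plug_and_play": initiator == target,
        # Narrow-side beats needed to carry one wide-side beat.
        "beats_per_wide_transfer": wide // narrow,
        # User bits the target cannot carry.
        "user_bits_dropped": max(0, initiator.user_width_bits - target.user_width_bits),
        # ID bits the target is missing, which would force ID remapping.
        "id_bits_short": max(0, initiator.id_width_bits - target.id_width_bits),
    }


# Made-up configurations for two chiplets that both "speak AXI".
cpu_side = AxiConfig(data_width_bits=512, user_width_bits=8, id_width_bits=6)
io_side = AxiConfig(data_width_bits=256, user_width_bits=0, id_width_bits=4)

print(adaptation_report(cpu_side, io_side))
```

Even in this reduced form, two parts that both claim AXI support differ in three ways that require custom logic, which is the gap between a shared protocol name and plug-and-play.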
Alon: The end result is certainly that there’s going to be a very broad variety of these die-to-die/NoC/chiplet implementations. It’s not going to be one chiplet that rules them all.
Rao: It is not like IP where you build one and sell it to everyone.
Alon: In the end there will be 10 such chiplet ecosystems defined.
Rao: Right now, large companies are doing chiplets in-house. The next wave would be maybe two, three, or four companies getting together. It’s a closed ecosystem with a small number of companies. They will work together. There will be some pre-aligned specs that have to happen. We have not seen that. We can see customers who are talking about it.
Yee: It is happening, and people are forming their own ecosystems for interoperability.
Kuemerle: But we’re going to have so many different ecosystems, because it’s all being done under NDA.
Yee: The problem with that is we have standardization that’s not standardized. Everything is a one-off from that standard, which means you don’t have interoperability. That defeats the whole purpose of standardization. The question becomes, how do we solve that problem? We all agree, standardization is a good thing, and it helps us.
Rao: It is needed to go to where we want — an open chiplet marketplace. That will only happen if there is standardization.
Yee: Even as an IP company, I don’t necessarily see it converging. I see it diverging right now with all those customizations. That’s what’s a little bit scary.
Alon: The customization part is not going to go away. And I don’t think it should scare us, simply because it’s a reflection of the reality of building custom SoCs, whether it’s monolithic or disaggregated. If you’re building an SoC, you’re going to build it to the application you need. The way I believe this is going to end up evolving is that the companies or end customers that have large enough volume by dollars will end up being the ones who will have to do the standardization. Meaning that, if you want to play in my world and you want to sell a chiplet to me, this is what it has to look like. This is the functionality. If you build that, I have a checkbook, I will write you a check, and I’ll buy stuff — as long as you’re better than the competitors.
SE: Google created the notion of this secure processor, the Titan, and they put that into the open and said, people go build that and then we can buy what you build. Is that the model that we’re looking for?
Alon: Bingo, that’s exactly it. Some of the folks who are getting together under NDAs and working together behind closed doors are saying these things are going to happen. Once they succeed, and if they are really bought into this open chiplet model, they will say, ‘These are my sockets. If you want to come play, build a chiplet into the socket, I now open the checkbook.’
Rao: And if company X has built this chiplet, which they are selling in this closed ecosystem, they are then going to sell it to everybody else. That’s how it will happen.
Yee: But do you see that happening? OCP with BoW. We saw that happen in the link layer. Someone donated it, and people started using it. At least you had something close to interoperability. Everyone was using the same protocol layer. I haven’t seen that in UCIe. That’s my problem. The momentum is there for UCIe. The maturity is still in BoW, with a cleaner solution right now. I keep on hearing ‘we support the raw streaming mode’ but I haven’t seen a clear spec for it.
Alon: To an extent, this is essentially the answer that I believe I’m trying to pose to this problem. Let’s not even worry about that, because that’s not the big problem. That is a symptom of the approach we took. It’s not the actual problem. If we had said, ‘I want this chiplet with this specific version of AXI and these specific user fields, and it’s talking to this thing over this substrate,’ I know exactly what to build. I don’t have to worry about it. It’s a thing. It’s specified. I build it.
Rao: That’s how it’s going to work. You will build it and you will probably build five different versions of it to enable five different applications and sockets. But once that happens, and there is a common thread between all of them, and you see one chiplet that is common across everybody, then that chiplet becomes a standard offering.
Kumar: You need programmability so that you can re-use it in different environments. That has to be there. But again, if you look at how IP has played out over the past 20 years, the chiplet will most likely go the same way. If you look at IP, the IP ecosystem really happened once AMBA became a standard.
Rao: Standardization is needed.
Kumar: Before that, every big company had their own IPs internally, and even small companies had to build their own IPs. Once AMBA became a standard, with the AXI and CHI protocols, the entire IP ecosystem of Arm, Cadence, and Synopsys came together. AMBA was the anchor behind the entire IP ecosystem. At the same time, also keep this in mind. In spite of the IP ecosystem being so strong today, AMD, Nvidia, and Intel do all of their IPs in-house.