Experts at the Table, part 3: The world of memories is changing rapidly but it is not yet clear which new approaches will become mainstream. System houses are looking at re-architecting the memory system.
Semiconductor Engineering sat down with a panel of experts to find out what is happening in the world of memories. Taking part in the discussion are Charlie Cheng, chief executive officer at Kilopass Technology; Navraj Nandra, senior director of marketing for analog/mixed-signal IP, embedded memories and logic libraries at Synopsys; Scott Jacobson, business development within sales and marketing at Cadence; and Frank Ferro, senior director for product development in the interfaces and memory division of Rambus. What follows are excerpts of that conversation.
In part one, the panelists talked about the bifurcation of memory requirements between mobile and desktop and some of the approaches needed to tackle them. In part two, they discussed some of the new memories and interface technologies.
SE: While memory compilers are good at building a memory, what about tools that help a system architect decide which memories are required?
Nandra: That is another level of sophistication that has not been addressed. These are tradeoffs in the overall SoC design. It has a lot of impact on how efficient you can make your software. You could get away with less memory if your software used memory more efficiently. There are some early discussions about this.
Jacobson: From the system point of view there is another layer that is being looked at – not just the memory performance but the interface performance. When thinking about tools, it is not enough to just be looking at the memory, but the memory plus the interconnect. Why does my software not respond quicker?
Ferro: From a DRAM perspective this is true.
Cheng: The industry does need a tool to make architectural tradeoffs. It is quite fascinating that it is not more developed. This is not a recent discussion and it is not just driven by power. There have always been issues of not enough performance or bandwidth – architectural decisions are always at the forefront of chip design. The problem is that it does not generate enough revenue. Architectural tools also marry chip and software, and this is outside of the core EDA business. It is a fascinating mathematical problem and a dreamland for computer scientists. Today, the latency and bandwidth of a system is limited by the wake-up time of the memory. But if you don’t go to sleep, you don’t save enough power.
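As a rough illustration of the sleep/wake tradeoff Cheng describes, the sketch below works out a break-even idle time using entirely hypothetical power and energy figures (the constants and function names are illustrative, not numbers from the panel): sleeping cuts idle power, but each wake-up costs energy and adds latency, so only idle windows longer than the break-even point are worth sleeping through.

```python
# Hypothetical figures illustrating the sleep/wake tradeoff: sleeping cuts
# idle power, but each wake-up costs extra energy and adds latency, so
# short idle windows are not worth sleeping through.

ACTIVE_IDLE_POWER_MW = 12.0   # assumed: power while idle but awake
SLEEP_POWER_MW = 0.5          # assumed: power in the sleep state
WAKE_ENERGY_UJ = 30.0         # assumed: energy spent on one wake-up
WAKE_LATENCY_US = 5.0         # assumed: latency penalty per wake-up

def breakeven_idle_us() -> float:
    """Idle duration beyond which entering sleep saves net energy."""
    # mW * us = nJ, so convert the wake-up cost to nJ and divide by the
    # power saved per microsecond spent asleep.
    saved_nj_per_us = ACTIVE_IDLE_POWER_MW - SLEEP_POWER_MW
    return (WAKE_ENERGY_UJ * 1000.0) / saved_nj_per_us

def worth_sleeping(idle_us: float) -> bool:
    """True if an idle window of idle_us microseconds saves energy overall."""
    return idle_us > breakeven_idle_us()

if __name__ == "__main__":
    print(f"break-even idle time: {breakeven_idle_us():.0f} us")
    print(f"each wake-up also adds {WAKE_LATENCY_US:.0f} us of latency")
    print("1 ms idle  ->", worth_sleeping(1_000))    # too short: stay awake
    print("10 ms idle ->", worth_sleeping(10_000))   # long enough: sleep
```

With these made-up numbers the break-even point lands near 2.6 ms, which is exactly the kind of figure an architectural tool would have to trade against the wake-up latency the workload can tolerate.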
Ferro: The goal with a system is to keep all of the beasts fed. We tend to rely on process nodes to solve the problems. But not anymore. The effort in architecture will be rewarded more in the future. We are moving into a time when architects will be forced to make better decisions.
Nandra: Eventually, you will be able to build heuristics into these tools based on the chip designer’s knowledge. This is the way many EDA tools are constructed. That is why I believe these are early discussions.
Cheng: I would argue that the architectural decision is a convergence problem. It is not NP-complete, so you don’t need heuristics. The problem is that it is so semiconductor-specific and requires too much data.
Nandra: But that is how you get the data down, by having some empirical or heuristic methods.
Jacobson: If you look at the history of compiler technologies, from the silicon compiler days to the present, we have always had this process marching forward based on Moore’s Law, assuming that the next technology node would provide wonderful improvements, so this was the obvious way forward. It was predictable. Now, it has become foggy. Memory technology architectural differences could affect the overall performance, and they may be at different nodes. The silicon choice may even back up to an older node. So now, when you make architectural choices and try to put them into a tool, it goes from something narrow to something very different. How do you control what you want to optimize when your field of constraints is pulling in multiple directions?
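To make the multi-objective point concrete, here is a toy sketch of the kind of pruning an architectural tool could do: enumerate candidate memory configurations and keep only the ones that are not dominated on bandwidth, power and cost at once. The class name, candidate list and all figures are made up for illustration; they are not vendor data or anything discussed by the panel.

```python
# Toy multi-objective pruning: keep only memory options that are not
# dominated on every axis (bandwidth up, power down, cost down).
from dataclasses import dataclass

@dataclass(frozen=True)
class MemOption:
    name: str
    bandwidth_gb_s: float  # higher is better
    power_mw: float        # lower is better
    cost_index: float      # lower is better (relative cost)

def dominates(a: MemOption, b: MemOption) -> bool:
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    no_worse = (a.bandwidth_gb_s >= b.bandwidth_gb_s
                and a.power_mw <= b.power_mw
                and a.cost_index <= b.cost_index)
    better = (a.bandwidth_gb_s > b.bandwidth_gb_s
              or a.power_mw < b.power_mw
              or a.cost_index < b.cost_index)
    return no_worse and better

def pareto_front(options: list[MemOption]) -> list[MemOption]:
    """Options that no other option dominates."""
    return [o for o in options
            if not any(dominates(other, o) for other in options)]

# Illustrative, made-up candidates (not vendor figures).
candidates = [
    MemOption("LPDDR4 x32",     12.8, 450, 1.0),
    MemOption("LPDDR3 x32",      6.4, 500, 0.8),
    MemOption("DDR3L x32",       6.4, 600, 1.2),   # dominated by LPDDR4 x32
    MemOption("Wide I/O 2",     51.2, 380, 1.8),
    MemOption("HMC short link", 120.0, 900, 3.5),
]

for option in pareto_front(candidates):
    print(option)
```

A real tool would add the latency, area and software-level constraints the panel mentions, which is exactly where the data problem Cheng raises comes in.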
SE: Will Wide I/O also be a constant march forward?
Jacobson: We went through a design effort with TSMC using their CoWoS (Chip-On-Wafer-On-Substrate) approach utilizing Wide I/O. We followed the same path that a customer would have taken. It opened our eyes to a lot of issues that would be faced by people moving to this kind of technology. Wide I/O was caught in a technology flux point in that it did not provide enough of a performance increase to merit the risk of a whole new technology involving 2.5D processing. Wide I/O 2 had to step out and give a performance boost in order to make it a viable economic tradeoff. Will it be enough? That is still a tough call to make.
Ferro: Wide I/O was marching forward very nicely when there was a big company in Texas driving it, but when they disappeared from the application processor market, everything came to a crashing halt. You still need a company that will take ownership of it and drive the manufacturing cost down. Until that happens…
Nandra: We decided not to get into Wide I/O and I think it was the right thing to have done. I have been asked if we should put effort into Wide I/O 2, and today I would say that LPDDR4 is eating into the Wide I/O 2 market. Wide I/O 2 may end up in the same place as Wide I/O, and without a driver it is even more likely. Wide I/O 3 may be different.
Jacobson: This is the same point that designers are facing right now. If LPDDR can keep up with the performance, why go to Wide I/O 2? But that is not the whole answer. There is a cost to implementing it in a system and there must be a driver to make it affordable. If you go with a higher-performance serial approach or a lower-performance, wider parallel approach, can you, for example, manage your signal integrity better? Can you manage system implementation costs? That is as much a driver as how fast each technology can get you, if they are otherwise equivalent. Now we have the luxury of asking which one, in my system implementation, will get me to the lowest cost point. The same is true for power and yield.
Cheng: The whole semiconductor industry is about solving tough engineering problems. Solving the signal integrity problem is lower cost than putting 512 bumps on a die. The genesis of the Wide I/O cost story is about using a legacy DRAM. Once the cost of new DRAM technologies is driven down by the PC and server markets, the old technologies become so much more expensive in a comparative sense that their fabs are obsolete, and they are being converted to driver, sensor and high-voltage fabs. This makes it difficult for Wide I/O to have a place. The low-cost feature phones may not need Wide I/O 2 either, and instead will go with LPDDR.
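Cheng’s 512-bump comparison can be put into rough numbers. The sketch below is back-of-the-envelope arithmetic with assumed per-pin rates (roughly 1 Gb/s per pin for a wide parallel interface and 12 Gb/s per lane for a SerDes-based one; the target bandwidth and function names are also assumptions): the same aggregate bandwidth can come from many slow pins or a few fast lanes, and the pin count is what drives bump count, routing and signal-integrity effort.

```python
# Back-of-the-envelope pin-count comparison for reaching the same aggregate
# bandwidth with a wide, slow parallel interface versus a narrow, fast
# serial one. Rates and the target are assumptions for illustration.
import math

def pins_needed(target_gbps: float, gbps_per_pin: float) -> int:
    """Smallest pin/lane count that reaches the target at the given per-pin rate."""
    return math.ceil(target_gbps / gbps_per_pin)

TARGET_GBPS = 512.0  # assumed target: 512 Gb/s (64 GB/s) of raw bandwidth

wide_pins = pins_needed(TARGET_GBPS, 1.0)      # ~1 Gb/s per data pin (assumed)
serial_lanes = pins_needed(TARGET_GBPS, 12.0)  # ~12 Gb/s per SerDes lane (assumed)

print(f"wide parallel: {wide_pins} data pins at 1 Gb/s each")
print(f"fast serial  : {serial_lanes} lanes at 12 Gb/s "
      f"(~{2 * serial_lanes} bumps if each lane is a differential pair)")
```

Neither count includes clocks, strobes, command/address or power and ground, but the order-of-magnitude gap is the cost tradeoff being weighed against the signal-integrity work the faster lanes demand.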
SE: What will change in the next 12 to 18 months?
Nandra: I expect huge adoption of LPDDR4 at 3200 Mb/s. For embedded memory we are seeing interest in getting down to 10nm and, for some applications such as automotive, in building reliability into memory compilers. Wearables and NFC are looking for integrated technology and higher densities that will encroach on embedded flash. There are lots of interesting innovations that I think we will see in this timeframe.
Ferro: After MemCon I wrote a blog that said LPDDR4 is here to stay for a while. The transition to LPDDR4 is interesting in that it is happening quickly, but the cost point will keep it to the highest-end phones, and others will remain with DDR3 for a while. Customers are asking about high-speed chip-to-chip interfaces around HMC and small SerDes.
Nandra: With HMC you can get away with 12G SerDes and you don’t need as much of the equalization technology. HMC2 is in the region of 18G to 20G.
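For a sense of scale, the rates mentioned here translate into the following rough peak numbers. The channel width and lane count in the sketch are assumptions (a 32-bit LPDDR4 channel and a 16-lane full-width HMC link); only the per-pin and per-lane rates come from the discussion, and protocol overhead is ignored.

```python
# Rough peak-bandwidth arithmetic for the rates mentioned above.
# Channel width and lane count are assumptions; overhead is ignored.

def lpddr4_peak_gbyte_s(mt_per_s: float, bus_bits: int = 32) -> float:
    """Peak transfer rate in GB/s for one LPDDR4 channel."""
    return mt_per_s * (bus_bits / 8) / 1000.0

def hmc_link_peak_gbyte_s(gbps_per_lane: float, lanes: int = 16) -> float:
    """Raw link bandwidth in GB/s per direction, before protocol overhead."""
    return gbps_per_lane * lanes / 8.0

print(f"LPDDR4-3200, x32 channel : {lpddr4_peak_gbyte_s(3200):.1f} GB/s")
print(f"HMC, 16 lanes @ 12 Gb/s  : {hmc_link_peak_gbyte_s(12):.0f} GB/s per direction")
print(f"HMC2, 16 lanes @ 18 Gb/s : {hmc_link_peak_gbyte_s(18):.0f} GB/s per direction")
```

Those raw-link numbers give a feel for why HMC-class interfaces appeal to the networking and high-performance segments discussed next.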
Jacobson: In the big iron days, nobody got fired for buying IBM. If you stick with a plan for DDR4 and LPDDR4, especially looking at risk management, this will be a driving force. We are still early into DDR4 and these will continue. In networking, graphics, high-performance computing and so on, there are clear drivers that go beyond the bandwidth these approaches can reach. These are pulling the market into transitioning to 2.5D and 3D memory solutions. They need more out of it than they are seeing from the best view of the DDR approaches. But there is a big gap at the moment: can the technology be matured, and the economics and risk brought into place, to answer the needs of those customers? There are customers who currently have unmet needs. They need more. If there is an answer that can meet their needs, they will be all over it. There is likely to be alignment in the industry driven by different industry segments.
Cheng: The industry will innovate around 28nm and there will be multiple new memory technologies. Most high-performance, high-bandwidth customers are not able to afford 20nm, 16nm and 10nm. So what happens at 10nm does not apply to everyone. They need some new technologies. Many analog/mixed-signal companies will move to 28nm, so this will be the exciting point for memory. This is where everyone will congregate.
“There have always been issues of not enough performance or bandwidth – architectural decisions are always at the forefront of chip design. The problem is that it does not generate enough revenue. Architectural tools also marry chip and software, and this is outside of the core EDA business.”
This excerpt from Cheng is, in my opinion, the most important statement in the series and, quite frankly, touches on one of the most pressing issues in the industry. Architectural tools (generally, not just for memory) are critical to the future of the industry, yet design companies are left expending resources and time implementing their own inferior tools to try to deal with the issue. While this may have been considered a competitive advantage in the past, it significantly hinders progress today.
If it does not generate revenue, then somebody needs to step up to the plate and start an industry-supported initiative.