Eliminating four key bottlenecks in memory development.
As noted in a recent blog post, demand for more memory is a common theme for many semiconductor-driven products. Artificial intelligence (AI) and machine learning (ML) algorithms rely on fast, plentiful memory for real-time performance, and storage at all levels is key to data-intensive applications. General-purpose memory devices are giving way to customized chips for applications such as AI, servers, and automotive to meet specific performance, power, and bandwidth requirements. The need to produce derivative designs and variants quickly is increasing time-to-market (TTM) pressures.
In response to these demands, memory devices are becoming more complex as well as larger, with very aggressive power, performance, and area (PPA) goals. Memories are increasingly grouped into multi-die configurations such as multi-chip modules (MCMs) and 2.5D/3D structures, posing significant challenges to design, analysis, and packaging. For example, the complete memory array, including the interconnections between the dies, and the power distribution network (PDN), must be considered while designing the most advanced high bandwidth memory (HBM) or 3D NAND flash chips to optimize for PPA and ensure silicon reliability.
Traditional memory design and verification techniques cannot meet the needs of these advanced devices. Simulation of large arrays takes far too long and delays TTM due to excessive turnaround time (TAT). Manual iterative loops when design issues are found late in the process further extend the schedule and consume extra resources. The only way to address this challenge is to “shift left” the memory design and verification process to perform better analysis earlier, avoid surprises late in the flow, and minimize iterations. This shift eliminates the four key bottlenecks in memory development that impact overall TAT and TTM: macro cell characterization, block-design optimization, the pre-layout to post-layout simulation gap, and custom layout design.
Macro cell characterization requires Monte Carlo simulations, which traditionally have been a significant but manageable portion of the analysis stage for memory designs. However, with contemporary devices, the cost in time and resources for brute-force Monte Carlo is prohibitive. Billions of simulation runs are typically required to achieve the desired high sigma characterization and ensure design robustness. Fortunately, this is a domain where ML can make a big difference. Highly accurate surrogate models of the design can be built and trained to predict high sigma circuit behavior, thus greatly reducing the number of runs required. Published case studies have shown that this approach can achieve speedups of 100-1000X over traditional methods while delivering accuracy within 1% of golden SPICE results.
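To see why brute-force sampling becomes prohibitive, consider the arithmetic behind a high sigma target. The sketch below is illustrative only: it assumes a single Gaussian-distributed metric with a one-sided failure criterion, and the function names are hypothetical rather than part of any tool.

```python
import math


def tail_probability(sigma: float) -> float:
    """One-sided tail probability P(X > sigma) for a standard normal X."""
    return 0.5 * math.erfc(sigma / math.sqrt(2.0))


def brute_force_runs(sigma: float, expected_failures: int = 10) -> float:
    """Runs needed so that, on average, `expected_failures` failing samples appear."""
    return expected_failures / tail_probability(sigma)


for s in (3.0, 4.0, 5.0, 6.0):
    print(f"{s:.0f} sigma: p_fail ~ {tail_probability(s):.3e}, "
          f"runs ~ {brute_force_runs(s):.3e}")
```

Under these assumptions, a 6-sigma target corresponds to a failure probability of roughly 1e-9, so observing even ten failures takes on the order of 1e10 runs. That is consistent with the billions of runs noted above and is the cost a surrogate model aims to avoid.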
The main iterative loop that prolongs memory project TAT and TTM is having to change the design as a result of the analysis. In the traditional flow, the designer makes decisions on the topology, chooses design parameters such as transistor sizes and R/C values, simulates the design, and examines the output. If the results do not meet the PPA goals for the project, the designer must tune the parameters, re-simulate, and re-evaluate the results. This sets up a manual loop that consumes precious engineering resources and delays the schedule. AI/ML can help here as well by shifting block-level design optimization and the PPA closure process left in time. An AI agent can automatically select device parameters, run simulations, learn from the results, and adjust the parameters iteratively until it converges on a set that meets the design specifications. AI-driven design optimization achieves desired goals orders of magnitude more quickly with far less manual effort.
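The select, simulate, evaluate, and adjust loop can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the AI-driven method the article refers to: `simulate` is a hypothetical analytic stand-in for a circuit simulator, the single `width` parameter and the spec values are invented, and the search is a plain step-halving local search rather than a learned agent.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Specs:
    max_delay: float  # arbitrary units, hypothetical
    max_power: float  # arbitrary units, hypothetical


def simulate(width: float) -> Tuple[float, float]:
    """Hypothetical stand-in for a simulation run: returns (delay, power).

    Wider devices are faster but burn more power in this toy model.
    """
    delay = 1.0 / width + 0.2
    power = 0.5 * width + 0.1
    return delay, power


def violation(width: float, specs: Specs) -> float:
    """Total amount by which the design misses its specs (0.0 means all met)."""
    delay, power = simulate(width)
    return max(0.0, delay - specs.max_delay) + max(0.0, power - specs.max_power)


def optimize(specs: Specs, width: float = 0.2, step: float = 0.4,
             lo: float = 0.1, hi: float = 2.0, max_iters: int = 50):
    """Automated tuning loop: try neighbors, keep improvements, shrink the step."""
    best = violation(width, specs)
    for iteration in range(max_iters):
        if best == 0.0:
            return width, iteration  # specs met
        candidates = [min(hi, width + step), max(lo, width - step)]
        cand = min(candidates, key=lambda w: violation(w, specs))
        cand_cost = violation(cand, specs)
        if cand_cost < best:
            width, best = cand, cand_cost  # accept the better candidate
        else:
            step /= 2.0  # no improvement: refine the search
    return width, max_iters


final_width, iterations = optimize(Specs(max_delay=1.5, max_power=0.65))
print(final_width, iterations, simulate(final_width))
```

The point of the sketch is the structure of the loop: once the pass/fail criteria are encoded as a cost, the tuning that a designer would do by hand can run unattended. A production optimizer would handle many parameters and corners and would use far fewer simulations than a naive search.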
Another major source of iterations that lengthen TAT and TTM is the gap between pre-layout and post-layout simulations. The goal is to anticipate the impact of parasitics on design specifications such as timing, power, noise, and stability as accurately as possible before layout, avoiding unpleasant surprises when parasitics are extracted from the layout. Unfortunately, with traditional flows, such surprises are common, resulting in repeated layout and simulation. The solution is an early parasitic analysis workflow that allows for accurate estimation of net parasitics for both pre-layout and partial-layout designs. Published case studies indicate that estimating parasitics with an early parasitic analysis workflow reduced the gap between pre-layout and post-layout timing from 20-45% to 0-20%. Early parasitic analysis workflows can be further improved with the use of ML to predict interconnect parasitics. Although this is an emerging technology, it is one that shows great promise.
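A small worked example shows how even a rough parasitic estimate can narrow the pre-layout to post-layout gap. The sketch below is an assumption-laden illustration, not the workflow described in the article: it uses the standard Elmore delay for a driver, a distributed RC wire, and a load, and every value (driver resistance, per-micron R and C, wire lengths) is hypothetical.

```python
def elmore_delay(r_drv: float, r_wire: float, c_wire: float, c_load: float) -> float:
    """Elmore delay (seconds) of a driver, a distributed RC wire, and a load."""
    return r_drv * (c_wire + c_load) + r_wire * (0.5 * c_wire + c_load)


def estimate_net(length_um: float, r_per_um: float, c_per_um: float):
    """Estimate wire R (ohms) and C (farads) from an expected routed length."""
    return r_per_um * length_um, c_per_um * length_um


def gap_percent(pre: float, post: float) -> float:
    """Relative gap between a pre-layout prediction and the post-layout value."""
    return abs(post - pre) / post * 100.0


R_DRV = 1000.0   # ohms, hypothetical driver resistance
C_LOAD = 2e-15   # farads, hypothetical load capacitance

# Pre-layout with no parasitics at all.
pre_ideal = elmore_delay(R_DRV, 0.0, 0.0, C_LOAD)

# Pre-layout with an estimated 100 um route at 2 ohm/um and 0.2 fF/um.
r_est, c_est = estimate_net(100.0, 2.0, 0.2e-15)
pre_estimated = elmore_delay(R_DRV, r_est, c_est, C_LOAD)

# "Post-layout" values, as if extraction found a 120 um route.
r_ext, c_ext = estimate_net(120.0, 2.0, 0.2e-15)
post = elmore_delay(R_DRV, r_ext, c_ext, C_LOAD)

print(f"gap without estimate: {gap_percent(pre_ideal, post):.1f}%")
print(f"gap with estimate:    {gap_percent(pre_estimated, post):.1f}%")
```

In this toy setup the estimate is off by 20% in wire length, yet the timing prediction lands far closer to the extracted result than the parasitic-free simulation does. Real nets need more than a single lumped length estimate, which is where partial-layout data and learned predictors would come in.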
Speeding the simulation and analysis of memory designs is clearly key for shifting the process left, but there is opportunity to reduce the time and effort for the custom layout stage as well. The same sub-circuit topologies recur frequently in memory designs. Existing layouts created by expert designers can be reused through templates that capture their placement and routing patterns. Junior designers can create new layouts from those templates using whatever device size they need, saving time and leveraging the expert wisdom and experience embodied in the original layout. Published case studies have shown that creating and using templates achieves more than 50% faster layout TAT for critical analog circuits in memories and produces more consistent layout quality regardless of the engineers’ experience. The next frontier in layout design is the use of ML techniques to automate analog layout placement and routing, driving further improvements in layout productivity.
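The template idea can be illustrated with a toy data structure. This is a minimal sketch, not how any particular layout tool represents templates: `apply_template`, the row-based pattern, the device names, and all dimensions are assumptions made for illustration, and routing is omitted entirely.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Placement:
    name: str
    x: float
    y: float
    width: float


def apply_template(rows: List[List[str]], widths: Dict[str, float],
                   row_pitch: float = 1.0, spacing: float = 0.1) -> List[Placement]:
    """Instantiate a relative placement pattern with concrete device widths.

    `rows` captures the expert's pattern (which devices share a row, and in
    what order); `widths` supplies the sizes chosen for the new design.
    """
    placements = []
    for row_index, row in enumerate(rows):
        x = 0.0
        for name in row:
            w = widths[name]
            placements.append(Placement(name, x, row_index * row_pitch, w))
            x += w + spacing  # next device sits to the right, after a gap
    return placements


# Hypothetical pattern: a matched pair on the bottom row, a tail device above.
pattern = [["M1", "M2"], ["M3"]]

small = apply_template(pattern, {"M1": 0.5, "M2": 0.5, "M3": 1.0})
large = apply_template(pattern, {"M1": 2.0, "M2": 2.0, "M3": 4.0})

for p in small + large:
    print(p)
```

The same pattern yields two different layouts: relative ordering and row assignment are preserved while absolute coordinates follow the new device sizes. That separation between pattern and sizing is what lets a template carry an expert's placement decisions into a derivative design.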
The Synopsys Custom Design Family removes all four memory design and verification bottlenecks using the techniques described above. Synopsys PrimeSim Continuum provides ML-driven high sigma Monte Carlo and a unified workflow across best-in-class circuit simulation technologies, eliminating the hassles and modeling inconsistencies inherent in point tool flows. In conjunction with Synopsys PrimeWave Design Environment, PrimeSim Continuum also delivers AI-driven circuit optimization and early parasitic analysis. Finally, the Synopsys Custom Compiler design and layout solution includes full support for template-based design reuse. Memory design and verification are challenging, and getting more so every year. Synopsys provides all the technologies to shift left the process, reducing TAT and TTM while maintaining the quality of the PPA results.