Bringing real route information and parasitics to any step in the place-and-route flow.
High-performance computing (HPC) applications require IC designs with maximum performance. However, as process technology advances, achieving high performance has become increasingly challenging. Designers need digital implementation tools and methodologies that can solve the thorny issues in HPC designs, including placement and clock tree challenges.
Placement and clock tree synthesis are critical steps in the physical design of high-performance compute integrated circuits (ICs), and they pose several challenges, including:
Siemens EDA offers Aprisa, a modern physical design implementation solution for hierarchical and block-level designs that addresses all the challenges of HPC ICs (figure 1). It was built from the ground up with a detail-route-centric architecture that reduces the time to design closure. Its unified data model is shared throughout the entire flow, bringing real route information and parasitics to any engine and any step in the place-and-route flow. This allows for consistent timing and DRCs across engines, which translates into excellent correlation with signoff tools and reduction of the number of ECOs.
Fig. 1: Aprisa architecture, the detail-route-centric digital implementation solution for fast design closure.
Aprisa optimizes placement with advanced knowledge of the detail routing, which minimizes the need for manual guidance from designers. By reducing or eliminating place guides altogether, designers can quickly achieve optimal placement without having to rely on experience and full-flow pre-runs. This early insight into timing, area, power, and congestion is extremely valuable for saving time and cost, especially as design cycles increase due to the complexity and size of today’s ICs.
At advanced nodes, the cost of adding each additional metal layer to a chip increases dramatically, often exceeding millions of dollars. Therefore, it is crucial for chip designers to carefully weigh the benefits of saving power versus the added cost of additional metal layers.
Fig. 2: Arm core A76 frequency vs. use of metal layers tradeoff.
Aprisa’s ability to provide early insight into PPA metrics allows designers to fine-tune their options for optimizing performance vs. power vs. cost very early in the flow. Sacrificing a little power savings, for example, can help them reduce manufacturing costs and still achieve the desired performance. These opportunities to study metrics while keeping to time-to-market goals empower designers to make informed decisions about tradeoffs that impact their bottom line.
Among other top challenges in HPC designs is clock tree synthesis (CTS). CTS is a critical step in the physical design process, as it determines the final timing of the design. CTS involves routing the source of the clock to all the sinks, including registers, latches, clock gates, and macro clock pins. A low-quality CTS can result in poor timing, high power consumption, and poor signal integrity.
The three primary factors that impact CTS are:
Aprisa supports useful skew starting at placement and continuing all the way to route optimization. This ensures that the challenging frequency targets of HPC designs are met. A strength of Aprisa’s CTS technology is that the push and pull offsets generated during placement optimization are realized during clock tree implementation.
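The idea behind useful skew can be illustrated with a small, tool-independent sketch. It is not Aprisa's algorithm; the function names, the register names, and the picosecond numbers are hypothetical, and only setup timing is modeled (hold checks are ignored). Pushing a register's clock later adds slack to the paths it captures and removes the same amount from the paths it launches.

```python
def setup_slacks(period, paths, offsets):
    """Setup slack per path, given a clock arrival offset for each register.

    paths: list of (launch_reg, capture_reg, data_delay), all in picoseconds.
    """
    return {
        (launch, capture): period + offsets[capture] - offsets[launch] - delay
        for launch, capture, delay in paths
    }


def balancing_offset(period, paths, offsets, reg):
    """Offset for `reg` that equalizes its worst incoming and outgoing slack."""
    slacks = setup_slacks(period, paths, {**offsets, reg: 0.0})
    worst_in = min(s for (_, cap), s in slacks.items() if cap == reg)
    worst_out = min(s for (lau, _), s in slacks.items() if lau == reg)
    # A push of x adds x to incoming slack and subtracts x from outgoing slack.
    return (worst_out - worst_in) / 2.0


# Hypothetical two-stage pipeline A -> B -> C with a 1000 ps clock period.
paths = [("A", "B", 1200), ("B", "C", 600)]
zero_skew = {"A": 0, "B": 0, "C": 0}

before = setup_slacks(1000, paths, zero_skew)
push_b = balancing_offset(1000, paths, zero_skew, "B")
after = setup_slacks(1000, paths, {**zero_skew, "B": push_b})

print(before)   # A->B fails setup, B->C has margin to spare
print(push_b)   # amount by which B's clock is pushed later
print(after)    # both paths now share the available slack
```

In this toy case the failing A-to-B path borrows time from the B-to-C path. In a real flow the offset is only useful if the implemented clock tree actually delivers it, which is why realizing the planned offsets during clock tree implementation matters.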
Another significant advantage for CTS is the ability to merge and de-merge multi-bit flip-flops and to clone and de-clone integrated clock gates (ICGs) based on timing, the physical location of cells, and the criticality of the paths. Because Aprisa understands CTS starting at the placement optimization stage, it can produce a better-optimized clock tree and reduce clock power.
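As a rough illustration of how timing and physical location can jointly drive a merge decision, the sketch below greedily pairs flip-flops into 2-bit cells only when both have enough slack and sit close together. It is a simplified assumption, not Aprisa's method; `pair_flops_for_banking`, its thresholds, and the sample data are all hypothetical.

```python
def pair_flops_for_banking(flops, max_dist, min_slack):
    """Greedily pick flip-flop pairs that could be merged into 2-bit cells.

    flops: {name: (x, y, slack)}. A flop is a candidate only if its slack is
    at least min_slack; a pair is accepted only if the Manhattan distance
    between the two flops is at most max_dist.
    """
    candidates = sorted(n for n, (_, _, slack) in flops.items() if slack >= min_slack)
    used, pairs = set(), []
    for i, a in enumerate(candidates):
        if a in used:
            continue
        ax, ay, _ = flops[a]
        best, best_dist = None, None
        for b in candidates[i + 1:]:
            if b in used:
                continue
            bx, by, _ = flops[b]
            dist = abs(ax - bx) + abs(ay - by)
            if dist <= max_dist and (best is None or dist < best_dist):
                best, best_dist = b, dist
        if best is not None:
            used.update((a, best))
            pairs.append((a, best))
    return pairs


# Hypothetical placement: f3 is timing-critical, f4 is too far away to merge.
flops = {
    "f1": (0, 0, 50),
    "f2": (1, 0, 40),
    "f3": (1, 1, -10),
    "f4": (20, 20, 30),
}
pairs = pair_flops_for_banking(flops, max_dist=2, min_slack=0)
print(pairs)
```

Here the critical flop stays single-bit so it can be placed and sized independently, while the two relaxed neighbors are merged to share clock pin load.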
After CTS, Aprisa will recover congestion created during CTS without impacting timing. With traditional tools, designers would have to iterate back to placement optimization to reduce congestion.
During post-CTS and route optimization, Aprisa can apply useful skew to further improve timing and achieve excellent correlation.
Multi-point CTS is the most popular clock structure for HPC designs. It offers better clock skew than a single-point tree and uses less power than a clock mesh. A multi-point clock structure splits the design into partitions, with each partition connected to an anchor buffer from the top-level design.
Aprisa automatically creates partitions based on anchor cells, factoring in metrics such as timing, physical proximity, and the load on each anchor cell. Designers can also use Aprisa to automatically generate the anchor points, which can save a significant amount of time and effort in the design process.
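To make the partitioning idea concrete, here is a minimal sketch that assigns each clock sink to an anchor using physical proximity plus a penalty for the load the anchor already carries. It is an illustrative greedy heuristic under stated assumptions, not Aprisa's implementation; timing is left out, and `assign_sinks`, `load_weight`, and the coordinates are hypothetical.

```python
def assign_sinks(sinks, anchors, load_weight=1.0):
    """Assign each sink to the anchor with the lowest distance-plus-load cost.

    sinks:   {name: (x, y, pin_capacitance)}
    anchors: {name: (x, y)}
    Returns (partition, load): sinks per anchor and total capacitance per anchor.
    """
    load = {a: 0.0 for a in anchors}
    partition = {a: [] for a in anchors}
    for name, (x, y, cap) in sorted(sinks.items()):
        def cost(anchor):
            ax, ay = anchors[anchor]
            # Manhattan distance plus a penalty for already-loaded anchors.
            return abs(x - ax) + abs(y - ay) + load_weight * load[anchor]

        best = min(sorted(anchors), key=cost)
        partition[best].append(name)
        load[best] += cap
    return partition, load


# Hypothetical floorplan with two anchors; s4 sits exactly between them.
anchors = {"L": (0, 0), "R": (10, 0)}
sinks = {
    "s1": (1, 0, 1.0),
    "s2": (2, 0, 1.0),
    "s3": (9, 0, 1.0),
    "s4": (5, 0, 1.0),
}
partition, load = assign_sinks(sinks, anchors, load_weight=1.0)
print(partition)
print(load)
```

With the load penalty enabled, the equidistant sink goes to the less loaded anchor and both partitions end up balanced. Setting `load_weight=0` reduces the heuristic to pure proximity, which is one way to see why load has to be part of the cost.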
Aprisa addresses the implementation challenges of HPC designs at advanced nodes using easy-to-deploy, out-of-the-box reference flows. It provides industry-leading correlation to signoff tools along with several technologies that reduce the number of ECO iterations. It keeps all PPA metrics carefully balanced for HPC design implementation through high-quality clock trees, placement, and patented routing technologies that reduce timing closure friction between the block and top level during assembly.
With the help of the right place-and-route tool, designers can bring their HPC design innovations to market faster with fewer engineering and compute resources.