Heterogeneous Computing Model Delivers Order-Of-Magnitude Performance Breakthrough

Using GPUs to accelerate circuit simulation technology.


By Srinivas Kodiyalam (NVIDIA) and Samad Parekh (Synopsys)

With the ever-increasing demand for more computing performance, the HPC industry is moving toward a heterogeneous computing model in which GPUs and CPUs work together to perform general-purpose computing tasks. In this model, the GPU serves as an accelerator to the CPU, offloading compute-intensive work to increase overall efficiency. To exploit this computing model and the massively parallel GPU architecture, application software needs to be redesigned. Synopsys and NVIDIA engineers have been working together to use GPUs to accelerate circuit simulation technology.
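
As a rough illustration of this offload pattern, the minimal CUDA sketch below moves data to the GPU, launches a kernel, and copies results back while the CPU stays free for other work. The kernel and variable names are hypothetical and are not taken from any Synopsys or NVIDIA product.

```cpp
// Minimal sketch of the CPU-host / GPU-accelerator offload pattern.
// Illustrative only; scale_voltages is a hypothetical kernel name.
#include <cuda_runtime.h>
#include <vector>

__global__ void scale_voltages(double* v, double factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] *= factor;   // each GPU thread handles one entry
}

int main() {
    const int n = 1 << 20;
    std::vector<double> host_v(n, 1.0);

    double* dev_v = nullptr;
    cudaMalloc(&dev_v, n * sizeof(double));
    cudaMemcpy(dev_v, host_v.data(), n * sizeof(double), cudaMemcpyHostToDevice);

    // The CPU can continue with other work while the GPU runs the kernel.
    scale_voltages<<<(n + 255) / 256, 256>>>(dev_v, 0.5, n);

    cudaMemcpy(host_v.data(), dev_v, n * sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(dev_v);
    return 0;
}
```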

IC design complexity has continued to grow exponentially. In just the last decade, as process technology has advanced from planar to FinFET transistors, designs have been burdened by very large device counts and interconnect parasitics. For example, moving from a 45nm process to a contemporary 5nm node brings roughly a 10x increase in device count and a similar increase in the number of process, voltage, and temperature corners.

Additionally, as device and component dimensions shrink, more physical effects must be modeled for accurate simulation. Parasitic effects that once had only a marginal influence now have a significant impact on overall circuit performance. Over the same period, CPU performance gains have largely plateaued, while GPU performance has continued to scale well beyond Moore's Law. These trends will only widen the gap between the two computing approaches over time.

Device model evaluation and matrix solution are the two dominant components of circuit simulation. Device model evaluation produces a massive number of independent computing tasks. Comparing the two architectures, CPUs are designed to handle a wide range of tasks quickly but are limited in concurrency, whereas GPUs provide thousands of processing cores running concurrently, which raises their throughput. GPUs therefore excel at massively parallel, independent workloads. In modern designs with very large transistor counts, where each device evaluation is independent of the others, every device instance can be mapped to a GPU thread so that thousands of evaluations run in parallel.
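
To illustrate the one-thread-per-device mapping, here is a minimal CUDA sketch that evaluates a toy exponential diode model for every device instance in parallel. A production simulator evaluates far more elaborate compact models (e.g. BSIM); the structures and names below are purely illustrative.

```cpp
// Sketch: one GPU thread per device instance. A toy diode equation
// stands in for a real compact model; names are not from PrimeSim.
#include <cuda_runtime.h>
#include <math.h>

struct DiodeParams { double is; double vt; };   // saturation current, thermal voltage

__global__ void evaluate_devices(const DiodeParams* params,
                                 const double* v,      // terminal voltage per device
                                 double* i_out,        // computed current per device
                                 int n_devices) {
    int d = blockIdx.x * blockDim.x + threadIdx.x;
    if (d < n_devices) {
        // Each evaluation is independent, so thousands run concurrently.
        i_out[d] = params[d].is * (exp(v[d] / params[d].vt) - 1.0);
    }
}

// Launch with roughly one thread per device instance:
//   int threads = 256;
//   int blocks  = (n_devices + threads - 1) / threads;
//   evaluate_devices<<<blocks, threads>>>(d_params, d_v, d_i, n_devices);
```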

Large post-layout circuits also produce large matrices, which require an enormous number of floating-point operations to solve. A matrix with a dimension of 10 million, for example, can give rise to more than 100 billion floating-point operations. CPUs are not built to deliver floating-point throughput on that scale, which is another reason simulations run so long. With their much higher compute performance and memory bandwidth, GPUs make a far more efficient matrix solution possible; the Tesla V100 GPU, for instance, delivers about 7 TFLOPS of double-precision performance.
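
As a simplified view of where that floating-point work goes, the CUDA sketch below computes a sparse matrix-vector product in CSR format, a basic building block of iterative linear solves. Production simulators rely on tuned sparse direct or iterative solvers (for example, libraries such as cuSPARSE); this kernel is only meant to show why GPU throughput and memory bandwidth matter for the matrix solution.

```cpp
// Sketch: sparse matrix-vector product y = A*x in CSR format.
// Circuit matrices are very sparse, so each row touches few nonzeros,
// and rows can be processed by independent GPU threads.
#include <cuda_runtime.h>

__global__ void csr_spmv(int n_rows,
                         const int* row_ptr,   // size n_rows + 1
                         const int* col_idx,   // column index per nonzero
                         const double* values, // nonzero values
                         const double* x,
                         double* y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n_rows) {
        double sum = 0.0;
        for (int k = row_ptr[row]; k < row_ptr[row + 1]; ++k)
            sum += values[k] * x[col_idx[k]];
        y[row] = sum;
    }
}
```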

Synopsys’ PrimeSim Continuum offers a next-generation architecture with unique GPU technology, delivering the significant performance improvements needed for comprehensive analog and RF design analysis while meeting signoff accuracy requirements. Benchmarks run on DGX systems with CUDA GPUs show speedups ranging from 4x to 12x over multi-core CPUs. While the gains apply across various circuit types, the largest improvements are seen on large post-layout simulations, and when coupled with long transient run times the improvement is even more pronounced.

PrimeSim achieves its most impressive performance gains by leveraging the massive parallelism of the CUDA GPUs. The core technologies involved are:

  • synchronous parallel computing on a heterogeneous GPU and CPU architecture
  • robust sparse solver for solving the circuit simulation system of equations
  • accurate and efficient IC component modeling
  • compact and efficient data model and management for the GPU (a data-layout sketch follows this list), and
  • fast circuit simulation database build and data processing
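
On the data-model point above, one common GPU-friendly choice is a structure-of-arrays layout, sketched below, so that adjacent threads read adjacent memory locations and accesses coalesce. This layout is an assumption for illustration only, not the actual PrimeSim data model.

```cpp
// Sketch of a structure-of-arrays (SoA) device table. Thread d reading
// width[d] sits next to thread d+1 reading width[d+1], so memory
// accesses coalesce into wide, efficient transactions.
struct DeviceTableSoA {
    int     n_devices;
    double* width;        // one contiguous array per parameter...
    double* length;
    double* vth;
    int*    node_a;       // ...and per terminal connection
    int*    node_b;
};

// Contrast with an array-of-structures layout, where thread d reading
// devices[d].width strides over unrelated fields and wastes bandwidth:
struct DeviceAoS { double width, length, vth; int node_a, node_b; };
```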

It is increasingly clear that the growing complexity of nano-scale IC simulation calls for heterogeneous computing with multiple GPUs connected by extremely fast interconnects.

Srinivas Kodiyalam is Senior Developer Relations Manager for Industrial HPC and AI at NVIDIA.

Samad Parekh is Senior Staff Product Manager for Custom Design and Physical Verification Group at Synopsys.


