Margin-driven methodology results in sub-optimal designs, but looking at design data holistically can help.
Advancements in silicon process technologies are enabling companies to deliver products with faster performance, lower power and greater functionality. These benefits are especially attractive for chip manufacturers serving markets such as high-end mobile and enterprise computing. However, the cost in both dollars and resources of bringing a 7-nanometer (nm) finFET-based system-on-chip (SoC) to market is orders of magnitude higher than with previous generations. At the same time, competitive pressures demand the same or an even shorter production schedule.
To successfully leverage and profit from 7nm technology, design teams need to increase productivity through faster design convergence. That means resolving the fundamental breakdowns in the traditional silo-based approach.
Margin-based approaches increase cost and make convergence difficult
FinFET devices switch much faster, creating sharper switching currents, and increasing design density causes higher localized power surges. These, along with higher power delivery network (PDN) parasitics across the chip, package and PCB, can result in considerable dynamic voltage swings. For a 7nm design with a 500mV supply, such fluctuations can be as high as 25-30% of the nominal supply, significantly greater than the specified available margin. This makes it nearly impossible for design teams to meet the required threshold without uniformly increasing the power grid across the chip, leading to routing congestion, timing bottlenecks and increased chip size.
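To make the scale of the problem concrete, a quick calculation using the figures quoted above (a 500mV nominal supply with 25-30% dynamic fluctuation) shows the absolute voltage swing involved; the numbers come from the text, and the helper function is purely illustrative:

```python
def swing_mv(nominal_mv: float, swing_fraction: float) -> float:
    """Absolute voltage fluctuation for a given fraction of the nominal supply."""
    return nominal_mv * swing_fraction

nominal = 500.0  # mV, the 7nm-class supply cited in the text

for frac in (0.25, 0.30):
    print(f"{frac:.0%} swing on {nominal:.0f} mV supply = {swing_mv(nominal, frac):.0f} mV")
```

A 125-150mV swing on a 500mV rail leaves very little room for the timing margins the rest of the flow assumes.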
Yet over-designing does not guarantee that the chip will function properly, since current tools can only simulate a very small fraction of all the operating modes. To compensate for this major simulation coverage hole, engineers create artificial operating conditions that exaggerate the power envelope and di/dt conditions. This produces significantly higher power/ground noise, so that even an over-designed chip fails to meet its threshold requirements. As a result, design convergence becomes difficult to reach and the production schedule suffers considerable delay.
To manage the growing design complexity challenges, the traditional approach has been to partition the design or design targets (timing, power, routing, IP, chip, package, PCB) into smaller, manageable blocks. This has been done primarily to work around the limitations of existing tools and methodologies, whose older architectures and data models cannot incorporate and analyze results from multiple domains. Even with the addition of distributed processing capabilities, these 20-plus-year-old technologies cannot effectively scale performance across multiple machines. To compensate for the lack of visibility into multi-domain interactions, design teams again add margin to account for the impact of one domain on another.
This margin-driven methodology results in sub-optimal designs since each block or design target is optimized using only the best- and worst-case values from the other blocks, for a single point in time.
Multi-domain, multi-physics co-optimization needed for holistic view of overall design data
Consider a multi-physics optimization methodology in which you start not with pre-defined margins but from an under-designed state with possible violations. Then simulate the design using realistic conditions, across multiple scenarios and multiple physical effects. By using advanced analytics you can collate the results from these multi-scenario simulations to gain meaningful insights and then translate them into changes that address the most likely failure points. This ultimately results in an optimized and less congested chip, allowing for faster convergence while mitigating failure risks.
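The collation step described above can be sketched in a few lines: run many scenarios, count how often each node violates its budget, and rank the frequent violators as the most likely real failure points. The node names, drop values and threshold below are invented for illustration; a real flow would read them from the simulator's results database:

```python
from collections import defaultdict

# Worst-case dynamic voltage drop (mV) per node, one dict per simulated scenario
# (hypothetical values for illustration).
scenarios = [
    {"cpu0/alu": 110, "cpu0/fpu": 90,  "gpu/shader": 140},
    {"cpu0/alu": 130, "cpu0/fpu": 85,  "gpu/shader": 95},
    {"cpu0/alu": 125, "cpu0/fpu": 150, "gpu/shader": 100},
]
threshold_mv = 120  # assumed per-node drop budget

# Count how often each node violates the budget across scenarios.
violations = defaultdict(int)
for scenario in scenarios:
    for node, drop in scenario.items():
        if drop > threshold_mv:
            violations[node] += 1

# Rank: nodes that fail in many scenarios are the most likely failure points.
ranked = sorted(violations.items(), key=lambda kv: kv[1], reverse=True)
print(ranked)
```

Fixing only the top of such a ranking, rather than padding the whole grid, is what keeps the chip from becoming over-designed.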
Succeeding with this methodology requires a multi-domain, multi-physics co-optimization approach that lets you look at the overall design data holistically. Given the limitations of existing EDA tools, a fresh approach that leverages advances in computer science, specifically in big data and analytics, is needed to break these design silos. Additionally, this approach needs to be independent of the design and simulation flow, so that it works seamlessly across multiple tool environments using industry-standard interfaces and data models.
Elastic scalability key to faster design convergence
Simulation turnaround time for complex billion-plus-instance designs needs to shrink significantly to make multi-scenario, multi-domain coverage practical. To achieve this, the software architecture needs to leverage the growing number of available CPU cores and scale elastically across the compute infrastructure depending on the size and stage of the design. It should harness large numbers of CPU cores in a flexible, scalable manner to limit simulation times to a few hours for even the most complex designs. Unlike existing solutions, it should not require dedicated machines that you must wait hours for; instead, it should leverage CPU cores that are unused and immediately available within the network. For example, if your timing simulation runs on a 512GB machine but uses only 16 of the 20 available cores, you should be able to use the remaining cores for other tasks without affecting the timing simulation. As design sizes grow or the number of simulated domains increases, capacity and performance have to scale elastically within the network.
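One simple way to picture the "use only the idle cores" idea is a worker pool sized from the machine's current load rather than a fixed reservation. The load-average heuristic and the `analyze_block` task below are assumptions for the sketch, not any specific vendor's API:

```python
import os
from concurrent.futures import ProcessPoolExecutor

def idle_cores(reserve: int = 1) -> int:
    """Estimate cores not currently busy, keeping `reserve` aside for other work."""
    total = os.cpu_count() or 1
    busy = int(os.getloadavg()[0])  # 1-minute load average (POSIX only)
    return max(1, total - busy - reserve)

def analyze_block(block_id: int) -> tuple[int, str]:
    """Stand-in for a per-block analysis task."""
    return block_id, "ok"

if __name__ == "__main__":
    workers = idle_cores()
    # Pool size follows what is free right now, not a dedicated reservation.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(analyze_block, range(8)))
    print(f"ran {len(results)} tasks on {workers} worker(s)")
```

A production scheduler would re-evaluate the free-core count as jobs come and go, which is the elastic behavior the paragraph above calls for.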
Effective design decisions arise from meaningful analysis feedback
Being able to quickly simulate large databases of different kinds of data will not, by itself, improve overall productivity, especially if the design is still being over-designed. To reach convergence, you need meaningful insights that let you fix real hot spots in the design. Using techniques proven in other applications, such as MapReduce, together with user-friendly Python-based interfaces, the data relevant to the task at hand should be quickly accessible. This allows you to make effective design choices and take action while the design is evolving. Actionable analytics, exposed through a powerful API and GUI, provide fast access to different databases and distributed platforms.
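As a toy illustration of the MapReduce pattern mentioned above, the snippet below maps power-grid records to (region, drop) pairs for instances over a drop budget, then reduces them to the worst violation per region. The records, field names and budget are invented; a real flow would stream them from the results database:

```python
from functools import reduce

# (instance, region, worst IR drop in mV) - hypothetical extraction results
records = [
    ("u1/inv_3",  "cpu0", 95),
    ("u2/nand_7", "cpu0", 142),
    ("u9/buf_1",  "gpu",  131),
    ("u4/dff_2",  "gpu",  88),
]
budget_mv = 120  # assumed drop budget

# Map: emit (region, drop) pairs only for instances above the budget.
mapped = map(lambda r: (r[1], r[2]), filter(lambda r: r[2] > budget_mv, records))

# Reduce: keep the worst violating drop seen per region.
def merge(acc, pair):
    region, drop = pair
    acc[region] = max(acc.get(region, 0), drop)
    return acc

hot_regions = reduce(merge, mapped, {})
print(hot_regions)  # worst violating drop per region
```

At scale the same map and reduce functions would run distributed across machines, which is precisely why the pattern suits billion-instance result sets.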
Flexible environment needed for real-time, multi-site collaboration
A flexible design and debug environment makes real-time, multi-site collaboration possible. You and your colleagues should be able to view, analyze and work on the same design and the same simulation from anywhere in the world, in real time, using only a single CPU with a small memory footprint.
Big data techniques proven to deliver faster convergence
For example, this methodology can be applied during the SoC integration phase to improve design convergence. At this stage, blocks and hard macros are not necessarily clean and may not be placed optimally at the top level. The combination of block- and SoC-level issues often results in millions of power/ground issues such as shorts, opens, high voltage drop and EM violations. Traditional methodologies that require dedicated hardware resources may take 7 to 8 hours to provide feedback on these issues. As shown in the image below, such flows not only fail to close block/hard-macro issues effectively but can also drag out resolution at the SoC level.
An elastic compute platform leveraging big data techniques, on the other hand, can reduce feedback time to 1 to 2 hours, allowing design teams to iterate multiple times a day instead of solving one problem every other day. This reduction in turnaround time can change the way design closure happens, as highlighted in the image below. Such an approach helps teams burn down design issues faster and, in turn, reach design convergence sooner.
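The iteration-rate claim follows directly from the turnaround numbers quoted above (7-8 hours traditional versus 1-2 hours elastic); the 10-hour workday below is an assumption used only to make the comparison concrete:

```python
def iterations_per_day(turnaround_h: float, workday_h: float = 10.0) -> int:
    """Whole fix-and-rerun cycles that fit in one workday (assumed 10 h)."""
    return int(workday_h // turnaround_h)

print(iterations_per_day(8))  # traditional flow: 1 cycle per day at best
print(iterations_per_day(2))  # elastic big-data flow: several cycles per day
```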
7nm design complexities will have a huge impact on cost, resources and schedule. Managing them with traditional tools and methodologies will significantly handicap your design efforts. To gain competitive differentiation, you will need to leverage advanced computer science techniques to drive better coverage, faster feedback and detailed analytics.