Monitoring system metrics and tuning key parameters for optimal performance.
Silicon lifecycle management (SLM) is one of the hottest emerging topics in the semiconductor industry. Chip and system developers face relentless demands for ever greater performance, reliability, functional safety, and security along with lower power consumption and silicon cost. Key applications driving these demands include data centers, autonomous vehicles, complex consumer devices such as tablets and smartphones, and Internet of Things (IoT) devices. New and innovative ways of designing, manufacturing, and deploying semiconductors are thus required.
SLM addresses these challenges by gathering and analyzing data throughout the silicon lifecycle, from the earliest phases of design through implementation, bring-up, manufacturing, device test, and even product usage in the field. Some of the data comes from tools used in the various stages and some is gathered from sensors, monitors, and structures embedded within the fabricated silicon. This analysis is used to optimize the chip designs during development, in the manufacturing process, and in the field. Quite often, data from many chips are brought together in the cloud for deeper analysis to identify common issues and track trends over time.
Synopsys’ SiliconMAX Silicon Lifecycle Management platform initially focused on design, manufacturing, and test, but the platform recently expanded to debug, bring-up, and in-field operation with the acquisition of Concertio Ltd., the company I co-founded with Tomer Paz and Andrey Gelman in 2016. A recent blog post by Steve Pateras discussed how this investment provides immediate benefits to SiliconMAX customers and enables some fascinating future areas to combine SLM technologies in powerful ways.
This post provides some details about Optimizer Runtime, the autonomous real-time performance tuning solution developed by Concertio and now available from Synopsys. It accelerates the performance of a system operating in the bring-up lab or in the field by applying low-level accelerations and tailoring the many system and application settings to work in concert with the currently running workloads. It runs automatically, continuously and in real time, producing a self-tuning system. All programs and applications run using the existing binaries, with no code changes or recompilations required.
At the heart of Optimizer Runtime is an AI-powered dynamic tuning agent that runs in the background on the system to be optimized. The agent continuously monitors system metrics and adjusts system and application settings to improve performance. As shown in the following figure, there are three dimensions to the analysis and tuning of systems. Within the operating system, there are many user-selectable parameters (sysctls) that few users ever touch because tuning them requires deep expertise. The optimizing agent knows how to adjust these parameters appropriately and uses ML to learn which adjustments improve system performance. The CPU hardware often exposes similar settings, and these are tuned in real time as well.
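To make the monitor-and-adjust idea concrete, here is a minimal sketch of a tuning loop in Python. It is not Optimizer Runtime's algorithm, which is not described in this post; it is a plain greedy search that changes one setting at a time and keeps the change only if a metric improves. The knob names are real Linux sysctls used purely as labels, while the candidate values, the `simulated_throughput` function, and its optimum are assumptions made up for illustration. Nothing is written to the running system.

```python
import random
from typing import Callable, Dict, List

Settings = Dict[str, int]

# Candidate values per knob (hypothetical choices, for illustration only).
KNOBS: Dict[str, List[int]] = {
    "vm.swappiness": [0, 10, 30, 60, 100],
    "net.core.somaxconn": [128, 1024, 4096, 8192],
}
START: Settings = {"vm.swappiness": 60, "net.core.somaxconn": 128}


def simulated_throughput(s: Settings) -> float:
    """Stand-in for a measured metric; a real agent would sample the system."""
    return (1000.0
            - abs(s["vm.swappiness"] - 10)
            - abs(s["net.core.somaxconn"] - 4096) / 64)


def tune(knobs: Dict[str, List[int]],
         measure: Callable[[Settings], float],
         start: Settings,
         rounds: int = 200,
         seed: int = 0) -> Settings:
    """Greedy search: try one random change per round, keep it if it helps."""
    rng = random.Random(seed)
    best = dict(start)
    best_score = measure(best)
    for _ in range(rounds):
        name = rng.choice(sorted(knobs))           # pick a knob to vary
        candidate = dict(best)
        candidate[name] = rng.choice(knobs[name])  # pick a value to try
        score = measure(candidate)
        if score > best_score:                     # keep only improvements
            best, best_score = candidate, score
    return best


if __name__ == "__main__":
    tuned = tune(KNOBS, simulated_throughput, START)
    print(tuned, simulated_throughput(tuned))
```

A production agent would also have to cope with noisy measurements and workloads that change over time, which this sketch ignores.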
Users typically see performance gains of 5% to 10% through tuning the operating system and CPU settings. More dramatic gains are possible for individual applications by optimizing the applications themselves and their interaction with the operating system. Many applications have configuration options that the agent can tune for further optimization. A typical system may have 10^300 possible combinations of settings in the CPU and software stack, far beyond the limits of manual trial and error, so no user, no matter how experienced, can achieve the same results by hand.
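To get a feel for how quickly such a search space grows, the sketch below multiplies the number of choices per setting. The breakdown used here, 300 independent settings with 10 values each, is an assumption chosen only to reach the order of magnitude of 10^300; the post does not say how that figure was derived. The trial rate of one billion configurations per second is likewise hypothetical.

```python
import math
from typing import Iterable


def search_space_size(option_counts: Iterable[int]) -> int:
    """Number of distinct configurations when settings vary independently."""
    return math.prod(option_counts)


def years_to_enumerate(size: int, trials_per_second: int) -> int:
    """Whole years needed to try every configuration once."""
    seconds_per_year = 365 * 24 * 60 * 60
    return size // (trials_per_second * seconds_per_year)


# Illustrative assumption: 300 settings, 10 possible values each.
ILLUSTRATIVE_COUNTS = [10] * 300
SIZE = search_space_size(ILLUSTRATIVE_COUNTS)

if __name__ == "__main__":
    print(len(str(SIZE)), "digits")
    print(len(str(years_to_enumerate(SIZE, 10**9))), "digits of years")
```

Even at that unrealistic trial rate, exhaustive enumeration is out of reach, which is why a guided search is needed rather than brute force.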
Even higher gains can generally be obtained by using optional software accelerators to cache and batch system calls, accelerate writing to log files, improve memory allocation algorithms, accelerate inter-process communication, and more.
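The general idea behind batching can be sketched in a few lines of Python. This is not how the Optimizer Runtime accelerators are implemented (they work with existing binaries, and their internals are not described here); it only illustrates why coalescing many small writes into fewer large ones reduces the number of calls into the operating system. `CountingSink` and `BatchedWriter` are hypothetical names, and each `CountingSink.write` call stands in for one system call.

```python
class CountingSink:
    """Stands in for a file descriptor; each write() models one system call."""

    def __init__(self) -> None:
        self.calls = 0
        self.data = bytearray()

    def write(self, chunk: bytes) -> int:
        self.calls += 1
        self.data += chunk
        return len(chunk)


class BatchedWriter:
    """Accumulates small writes and forwards them in larger batches."""

    def __init__(self, sink: CountingSink, batch_bytes: int = 4096) -> None:
        self._sink = sink
        self._limit = batch_bytes
        self._buf = bytearray()

    def write(self, chunk: bytes) -> None:
        self._buf += chunk
        if len(self._buf) >= self._limit:
            self.flush()

    def flush(self) -> None:
        if self._buf:
            self._sink.write(bytes(self._buf))
            self._buf.clear()


def write_log(lines: int, batched: bool) -> CountingSink:
    """Write `lines` 32-byte log lines, directly or through the batcher."""
    sink = CountingSink()
    out = BatchedWriter(sink) if batched else sink
    line = b"x" * 31 + b"\n"
    for _ in range(lines):
        out.write(line)
    if batched:
        out.flush()  # push out whatever is still buffered
    return sink


if __name__ == "__main__":
    print(write_log(1000, batched=False).calls, write_log(1000, batched=True).calls)
```

The trade-off is latency and durability: buffered data is not visible to readers, or safe from a crash, until it is flushed.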
This approach can be applied to any software stack through the use of plugins that define the settings and parameters available for tuning as well as the metrics used to measure system performance. Plugins are available for popular operating systems and applications, and it is straightforward to create custom plugins for proprietary applications. Once this information is in place, the optimization agent runs automatically and autonomously in the background, constantly looking for opportunities to refine and enhance performance.
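As a rough picture of what such a plugin has to describe, the following Python sketch models the two ingredients named above: the settings available for tuning and the metrics used to judge performance. The schema, class names, and the example application are all hypothetical; the actual Optimizer Runtime plugin format is not shown in this post and may look quite different.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Tunable:
    """One setting the agent may change, with its allowed values."""
    name: str
    choices: Tuple[int, ...]
    default: int


@dataclass
class Plugin:
    """Hypothetical plugin: tunable settings plus metric samplers."""
    name: str
    tunables: List[Tunable]
    metrics: Dict[str, Callable[[], float]]  # metric name -> sampler

    def defaults(self) -> Dict[str, int]:
        return {t.name: t.default for t in self.tunables}

    def validate(self, settings: Dict[str, int]) -> None:
        """Reject unknown settings and values outside the declared choices."""
        allowed = {t.name: t.choices for t in self.tunables}
        for key, value in settings.items():
            if key not in allowed:
                raise ValueError(f"unknown setting: {key}")
            if value not in allowed[key]:
                raise ValueError(f"{key}={value} is not an allowed value")

    def sample(self) -> Dict[str, float]:
        return {name: read() for name, read in self.metrics.items()}


# Example for an imaginary proprietary application.
PLUGIN = Plugin(
    name="my-database",
    tunables=[
        Tunable("worker_threads", (1, 2, 4, 8, 16), default=4),
        Tunable("cache_mb", (64, 256, 1024), default=256),
    ],
    # A real sampler would query the application; this one returns a constant.
    metrics={"requests_per_second": lambda: 1250.0},
)

if __name__ == "__main__":
    print(PLUGIN.defaults(), PLUGIN.sample())
```

Declaring the allowed values up front lets the agent stay inside limits the application owner considers safe.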
The entire optimization process requires no user intervention to deliver acceleration results, but the user has control over its scale. This has two advantages. First, the user can adopt a gradual approach, defining a few tuning parameters and metrics at first and then adding more over time to broaden the optimization. Second, the user can control how often the analysis is performed, which is helpful in systems with limited CPU bandwidth, such as IoT devices with relatively small embedded processors.
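Controlling how often the analysis runs can be as simple as a throttle around the tuning step. The sketch below is a hypothetical illustration in Python, not an Optimizer Runtime setting; the `Throttle` class and the 60-second interval are assumptions.

```python
import time
from dataclasses import dataclass


@dataclass
class Throttle:
    """Allows an action at most once every `interval_s` seconds."""
    interval_s: float
    _last: float = float("-inf")

    def due(self, now: float) -> bool:
        """Return True, and record the time, if the interval has elapsed."""
        if now - self._last >= self.interval_s:
            self._last = now
            return True
        return False


if __name__ == "__main__":
    throttle = Throttle(interval_s=60.0)  # analyze at most once a minute
    if throttle.due(time.monotonic()):
        print("run one analysis step")
```

On a small embedded processor, a longer interval trades slower convergence for less CPU time spent on tuning.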
Although today’s focus is on edge AI analysis and optimization of software metrics, integrating Optimizer Runtime more deeply into SiliconMAX will enable additional capabilities. Connecting the metrics and results from multiple individual systems to deeper ML in the cloud will provide identification of common themes across the field and detection of trends over time. Connecting the edge and cloud analysis to the SLM hardware monitors and sensors in the silicon will support power optimization and enable predictive maintenance before silicon failures can occur in the field.
Silicon lifecycle management is proving a powerful and valuable way to optimize chip development, manufacturing, test, and deployment. There are many opportunities to combine SLM technologies in innovative ways to provide even more value to the semiconductor industry. A white paper is available to present the full scope of the Synopsys SLM vision and fill in some more of the technical details.