Designing resilient chips with SLM can help combat aging effects and security threats, and speed time to market with higher yields.
Silicon lifecycle management (SLM) is transforming chip architectures, empowering designers to build smarter, more resilient, and secure semiconductor devices by leveraging data from manufacturing to end of life in the field.
That data can be used to improve future designs, reduce margin, and continuously optimize performance and power efficiency throughout a chip’s lifetime. Moreover, understanding the full lifecycle can enable chip designs that are more resilient to aging effects and new security threats, with the ability to monitor and potentially mitigate issues after deployment.
“The ultimate promise of SLM is finding things that would probably never be found, to shorten that lifecycle, and to improve yield, reliability, security, time to market, and operational costs,” said Simon Rance, general manager and business unit leader for process and data management at Keysight EDA. “It is a challenge to put SLM components and blocks inside the design to support the telemetry and the test and debug capabilities, and it often requires experts to do this efficiently with quite a lot of expertise. While the upfront effort is quite a lot, it does reap a lot of rewards downstream, because you’re taking that shift-left type of approach, finding things almost in real-time, which allows you to iterate and refine a lot earlier and a lot quicker.”
Data collected by monitors inserted into chips and packages can also improve failure analysis and quality control, accelerating yield ramps and helping identify root causes of issues more quickly. That, in turn, can speed time to market, while opening the door to new services and business models that rely on long-term performance and reliability, such as in the automotive or consumer electronics industries.
“It requires that we look at the chip from the inside, and have a very high-resolution visibility at different physical parameters that will feed into our ability to characterize, per chip, what its expected measurement will be,” said Noam Brousard, vice president of solutions engineering at proteanTecs. “We have very high-resolution monitors in the chip, which can negate noise factors to look exactly at the process itself. We also do more novel things, like looking at path delays, measuring mostly in the latter stages of testing. And we’re comparing it with what we expected at pre-silicon using simulation. We’re looking at the timing margin of millions of paths, or the delay of millions of paths, compared to the simulation.”
While the benefits of SLM are clear, integrating these capabilities into chip designs introduces new complexities and considerations that must be addressed throughout the development process. This leads to important questions about how SLM strategies should be tailored to specific use cases and the architectural decisions that follow.
“One of the challenges with silicon lifecycle management is that it’s use case dependent,” said Randy Fish, product line director for SLM at Synopsys. “Test is very complex today, but what you’re trying to accomplish is still more or less understood. With SLM, which is a close sibling of test, it’s much more use-case driven. There are in-field use cases, where you’re going to use the information in mission mode. There are in-test use cases where you’re using it during scan, or during system-level test, and depending on what you’re trying to accomplish, it will impact your architecture differently. On a sophisticated chip today, when you have a large die doing yet another AI inference chip or training, you can have PVT (process, voltage, temperature) monitors with hundreds of sense points. It’s no longer just a single thermal diode on the corner of the die. It’s a complex infrastructure that feeds into a PVT controller, which is an RTL controller that a number of these can feed into. You can have multiple controllers, and they can go to APB or other interfaces that usually end up at a system control processor (SCP). A lot of people use Arm controllers or other sources. Then that data can be centralized there. Or, in the case of something like AVS, you may want that brought in to not only a software solution where it’s managing a PMIC externally, for example, but also to a hardware solution where it’s having to react very quickly. So you’re having on-chip LDOs and things that are very fast response, and you want that to be within cycles, basically, as opposed to your PMIC, which you’re operating at a much slower frequency response. This can be software-managed, so the architectures are really varied based on which problem set you’re trying to solve.”

Fig. 1: Sample architecture of SLM plus test solution stack. Source: Synopsys
As chip architects weigh these considerations and navigate the complexities of SLM integration, the practical implementation details become crucial. This is where expert insights and real-world examples shed light on the challenges and solutions that shape effective lifecycle management strategies.
“The first consideration is to have a clear understanding of which types of sensors/monitors are needed, how many, and where to place them,” said Geir Eide, senior director of product management for Tessent Silicon Lifecycle Solutions at Siemens EDA. “On one hand, design-for-test (DFT) structures can be re-purposed to facilitate high-quality in-field test. As these structures already exist in the design, the impact on design and the design process is relatively minimal. On the other hand, some sensors, such as slack monitors, are sensitive to the physical placement, and the overall quality of results will also depend on how many sensors are inserted into a design. Placement of these can also be a challenge, in that the most sensitive paths are not known until after place-and-route, and at that point you typically don’t want to insert additional objects (sensors) into the design.”
Another important consideration involves the infrastructure used to collect data from all the sensors, Eide said. “How often will measurements be done? Are measurements made continuously during functional operation, or less often as part of a diagnostic test? How much data needs to be collected? It can be advantageous to reuse a functional bus or other existing infrastructure, such as IEEE 1687 IJTAG, but this might not be feasible, depending on the bandwidth requirements and requirements for when the sensors need to be operated (for instance, in full functional mode vs. during structural test). Especially for complex designs, with lots of data and many monitors, it’s important to use an approach that scales well.”
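The questions Eide raises about measurement frequency and data volume lend themselves to a back-of-the-envelope estimate. The sketch below is illustrative only: the monitor counts, sample sizes, sample rates, and the 10 Mbit/s link figure are assumed numbers, not values from any vendor or standard, and `MonitorGroup`, `required_bandwidth_bps`, and `fits_on_link` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class MonitorGroup:
    """A set of identical monitors (hypothetical parameters, for illustration)."""
    name: str
    count: int               # number of sense points in the group
    bits_per_sample: int     # payload size of one reading
    samples_per_sec: float   # how often each sense point is read


def required_bandwidth_bps(groups):
    """Total raw telemetry bandwidth in bits per second."""
    return sum(g.count * g.bits_per_sample * g.samples_per_sec for g in groups)


def fits_on_link(groups, link_bps, utilization_cap=0.5):
    """True if telemetry stays under a chosen fraction of the link's capacity."""
    return required_bandwidth_bps(groups) <= link_bps * utilization_cap


# Continuous monitoring during functional operation (assumed numbers).
continuous = [
    MonitorGroup("thermal", count=200, bits_per_sample=16, samples_per_sec=1_000),
    MonitorGroup("slack", count=500, bits_per_sample=32, samples_per_sec=10_000),
]
# The same monitors read only occasionally, as part of a diagnostic test.
diagnostic = [
    MonitorGroup("thermal", count=200, bits_per_sample=16, samples_per_sec=1),
    MonitorGroup("slack", count=500, bits_per_sample=32, samples_per_sec=1),
]

SLOW_LINK_BPS = 10_000_000  # assumed 10 Mbit/s serial access network

print(required_bandwidth_bps(continuous))          # raw bits per second
print(fits_on_link(continuous, SLOW_LINK_BPS))     # continuous streaming
print(fits_on_link(diagnostic, SLOW_LINK_BPS))     # occasional diagnostic reads
```

With these assumed figures, the occasional diagnostic reads fit comfortably on the slow link while continuous streaming does not, which is the kind of result that would push a design toward a faster transport or on-chip data reduction.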
All of this means SLM is changing the way complex SoCs are architected. “As scaling limits tighten and reliability margins compress, design teams are embedding dense networks of sensors, monitors, counters, and trace points throughout the silicon to expose real-time information about voltage droop, thermal gradients, timing slack, aging, and workload behavior,” noted Andy Nightingale, vice president, product management and marketing at Arteris. “This data is no longer a post-silicon aid—it is becoming an architectural input that shapes how power, performance, reliability, and safety are managed from first boot through end-of-life.”
For chip architects, the presence of SLM requires a shift in thinking. “Monitoring infrastructure must be planned early in the architecture phase, not bolted on during RTL. This includes defining where sensors are located, how their data is aggregated, and how telemetry flows coexist with performance-critical traffic. Interconnect fabrics now play a central role here: modern NoCs must provide isolation, predictable latency, and resilience features that allow SLM traffic to be transported safely at scale,” Nightingale said.
Then, once the technical foundations are established, chip architects must focus on the practical realities of implementing SLM within diverse system environments, balancing the intricate requirements of advanced SLM monitoring with the operational challenges inherent to complex SoC architectures.
“SLM systems can be either tightly or loosely coupled to the overall SoC architecture, depending on the defined need,” said Vikram Karvat, chief operating officer at Movellus. “Think about SLM in two buckets — sensors and actuators. Sensors can be loosely coupled, with minimal impact on the architecture. For example, a sensor network providing timing margin or process information may be as simple as interfacing to those sensors over standard interfaces like APB. When you are doing more than sensing, and real-time action is being taken based on the sensor output, actuators come into play. Then, the implementation of the combined sensing and actuation functions needs to be planned out in advance, as the SoC architect is using this combination to achieve performance targets and architectural guarantees in certain situations. Planning can involve partitioning of clock and power domains, DVFS/DFS architecture and capabilities, placement considerations, package design, PDN design, etc. Over time, there is, of course, the feed-forward loop from having rich telemetry that can be applied to next-generation silicon design and architectural decisions therein. This can be thought of in two ways — sensing versus actuation, and current versus follow-on silicon generations.”
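Karvat’s distinction between sensing and actuation can be illustrated with a toy control loop. This is a minimal sketch under stated assumptions, not a description of any Movellus product: the frequency steps, the picosecond thresholds, and the function names are all hypothetical, and the readings are simply replayed rather than responding to the chosen frequency as they would in real silicon.

```python
FREQ_STEPS_MHZ = [2000, 1800, 1600, 1400]  # assumed DFS operating points, fastest first


def next_freq_index(current_index, margin_ps, low_ps=20.0, high_ps=60.0):
    """Pick the next operating point from a timing-margin reading.

    Below low_ps the loop steps down to a slower clock; above high_ps it
    steps back up; in between it holds, so the gap acts as hysteresis.
    """
    if margin_ps < low_ps and current_index < len(FREQ_STEPS_MHZ) - 1:
        return current_index + 1
    if margin_ps > high_ps and current_index > 0:
        return current_index - 1
    return current_index


def run_loop(margin_readings_ps, start_index=0):
    """Replay a series of sensor readings and return the chosen frequencies."""
    index = start_index
    chosen = []
    for margin in margin_readings_ps:
        index = next_freq_index(index, margin)
        chosen.append(FREQ_STEPS_MHZ[index])
    return chosen


# Assumed margin readings in picoseconds: a droop event followed by recovery.
readings = [80.0, 15.0, 10.0, 40.0, 70.0, 75.0]
print(run_loop(readings))
```

A loosely coupled sensor network would stop at collecting `readings`. Once `next_freq_index` is in the loop, the clock architecture has to support the step changes it requests, which is why the combined function needs to be planned in advance.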
In addition to hardware considerations, SLM should also include software that can be hooked into the design. From there, it needs to be validated in real-time and in-service. “Given our company’s roots in test and measurement, we’re finding new ways to pull all of those capabilities as early in the design cycle as possible,” Keysight’s Rance said. “It is a challenge, but we’re finding that when you do it right, and you use the capabilities and expertise that we have, we usually jump-start that a lot quicker, and help our customers get there a lot quicker, as well. It is almost a handholding across that lifecycle, not dealing with just one team. The design team, test team, measurement team, validation team, and manufacturing team don’t traditionally talk to each other. It’s usually a handoff right from one to the other. The challenge is how to get them all collaborating. How do you get them to share the right information at the right time? How is it automated in those workflows? That’s where workflows are helping. Those workflows are helping pull together all those teams, all those elements, all the data. How do we have that single source of truth that can enable all of this and trace it?”
Siemens’ Eide pointed to two examples of SLM implementations. “There have been many examples of hyperscaler companies reporting the impact of field errors in megascale datacenters,” he said. “These tend to be very difficult to detect. Many of these types of errors can be traced back to timing-related issues as the performance of transistors changes over time. Amazon Web Services published a paper on extending design-for-test methodologies and infrastructure typically used for manufacturing test to address such timing-related issues through ‘in-field IC monitoring, swapping parts as they fail, and going after field [diagnosing] failures.’”
In the second example, Meta described how its IC debug architecture facilitates in-field software debug and analysis, which is the step after monitoring. Here, the focus is on software performance, not IC defects or aging.
SLM can also be used for power monitoring and optimization. Movellus’ Karvat cited voltage droop mitigation, power network visibility to optimize DVFS setpoints, active DFS, and DVFS control. The goal is to enable repartitioning of code or scheduling to smooth out current draw or avoid hot spots, and intelligent adaptation to mitigate aging effects.
Other public examples illustrate how this is already influencing design choices. “Arteris’ FlexNoC and Ncore IP have been adopted in AI, automotive, and chiplet-based designs from partners such as AMD, Mobileye, and Blaize, where the interconnect is used not only for data movement but also for routing RAS, debug, and monitoring information across large, distributed systems,” Nightingale noted.
These practical examples and technical nuances highlight how SLM solutions are evolving from theoretical concepts to essential components in modern chip design. As the industry embraces more sophisticated sensor networks, integrated workflows, and adaptive architectures, the role of SLM only grows in importance, setting the stage for a new paradigm in silicon development and lifecycle management.
“Where AI-driven, self-evolving design automation is the foundation, SLM represents more than an architectural enhancement,” said William Wang, CEO of ChipAgents. “It becomes a data and intelligence substrate for the next generation of agentic EDA systems. In traditional design flows, architects make static tradeoffs between performance, power, and reliability based on simulation and pre-silicon assumptions. But with SLM, real-world telemetry feeds directly back into the design ecosystem, allowing AI agents to continuously learn from deployed silicon. This transforms the front-end process into a living, adaptive loop, where the RTL, verification strategies, and even architectural templates evolve dynamically in response to field data.”
For a self-evolving design agent, SLM data serves as both context and ground truth. “Process variation, workload behavior, and degradation patterns are not just monitored, but used to retrain optimization models,” Wang said. “Agents can propose incremental architectural refinements, re-parameterize modules, or even re-synthesize localized logic blocks for better efficiency and longevity. In this sense, the chip becomes part of its own design feedback mechanism, closing the gap between design intent and operational reality.”
To fully leverage this, front-end architectures need to be modular, parameterizable, and introspective, built with agent-readable intent so that AI agents can reason about tradeoffs autonomously. “The EDA environment must expose SLM data through standardized APIs and semantic layers that agents can interpret without human mediation,” he said. “Security and data provenance mechanisms become essential here too: agents must trust the data they act on, ensuring that any design evolution remains safe, verified, and explainable.”
Ultimately, ChipAgents’ vision is to use SLM to turn silicon into both product and teacher. “Each chip deployed in the field contributes to a collective intelligence that shapes future generations of design. The frontier of EDA moves from static automation to active collaboration, where intelligent design agents and self-aware chips co-evolve in a continuous feedback ecosystem. The best hope might be bridging pre-silicon and post-silicon with AI agents,” Wang added.
Given these innovations and strategic considerations, the integration of SLM technologies introduces new layers of complexity and opportunity for system architects. As organizations move from theory and early deployment toward practical implementation, the next step is understanding how to harness SLM data and infrastructure for tangible benefits within real-world devices and workflows.
For SLM newbies
For the architect who hasn’t had SLM monitors in their devices previously, there are several questions they need to ask, which point back to the use cases.
“First, what are we going to do with the data?” Synopsys’ Fish said. “The low-hanging fruit is, ‘I just want access to this data. We’ve never been able to see inside our chip all the time, so I want to have the infrastructure and the telemetry to be able to view all the temperature, margining, voltages, and glitches as they’re happening.’ Or, ‘We want to gather the data over a period of time and then export it off the chip.’ How is the data going to be gathered on the chip? Which processor is going to gather it? Where is it going to be stored? Is it being buffered in memory? Is it being serviced across a baseboard management controller on the motherboard, or is it being serviced at the top of the rack, or is it calling all the way home like a car might do? This means the architecture of the telemetry is something the architect really cares about. Some of it comes down to speeds and feeds. If you just want to have a trigger occasionally, or only alert when it goes over a certain value, maybe it’s lower speeds, so you don’t need to buffer that much data. But if you’re trying to stream out a bunch of information because you’re profiling some workload, the speed of the protocols you’re going across, the size of the buffers, how often do you process that data to reduce it to useful information, because you can’t afford to stream everything all the time off the chip? Being able to intelligently process it is important because that affects how you size your system control processor. Is it an M0? Is it an M3? How large do you go on those today? Is it running Linux or RTOS, or is it bare metal?”
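The two telemetry styles Fish contrasts, an occasional alert versus reduced profiling data, can be sketched in a few lines. The example below is illustrative: the temperature readings, the 85-degree limit, the window size, and both function names are assumptions chosen for the example, not part of any Synopsys flow.

```python
from statistics import mean


def threshold_alerts(samples, limit):
    """Low-rate mode: report only the indices and values that exceed a limit."""
    return [(i, s) for i, s in enumerate(samples) if s > limit]


def windowed_summary(samples, window):
    """Profiling mode: reduce each window of raw samples to (min, mean, max)."""
    summaries = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        summaries.append((min(chunk), mean(chunk), max(chunk)))
    return summaries


# Assumed die-temperature readings in degrees C.
temps = [61, 63, 62, 64, 90, 92, 65, 63]

print(threshold_alerts(temps, limit=85))    # only the excursions leave the chip
print(windowed_summary(temps, window=4))    # three numbers per window of samples
```

Either reduction has to run somewhere on the chip, and how much of it runs, and how often, is part of what determines the size of the system control processor and its buffers.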
This is where specific IP may be useful. It can, for example, run scan sequences across PCIe, USB, or other communication protocols to enable in-field scan test, which then becomes part of the in-field experience, along with the ability to run system-level tests or, for functional safety (FuSa) purposes, safety tests. There is increasing demand for in-field structural tests to isolate what is happening in failed devices, as in silent data corruption scenarios, and testing is a key part of understanding this.
“We’re flagging those chips that are outliers before you get to the point where you have to compensate with redundancy or adding more margin or guard-banding,” said proteanTecs’ Brousard. “We’ll see not only if it doesn’t meet the expected timing margin in the path, but even if it does, how marginal is it really? That’s a novel way of looking at a chip — not just is it pass or fail, but is it passing very marginally so it might fail very soon and very early in its lifetime.”
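The idea of grading a chip on how marginally it passes, rather than on pass or fail alone, can be shown with a simple classifier. This is a sketch under assumed values, not proteanTecs’ method: the 50% cutoff, the picosecond figures, and the `classify_margin` name are hypothetical.

```python
def classify_margin(measured_ps, expected_ps, marginal_fraction=0.5):
    """Classify a path's measured timing margin against its expected value.

    "fail"     : no positive margin left
    "marginal" : passes, but with less than marginal_fraction of the expected margin
    "healthy"  : passes with margin close to or above expectation
    """
    if measured_ps <= 0:
        return "fail"
    if measured_ps < marginal_fraction * expected_ps:
        return "marginal"
    return "healthy"


# (chip id, measured margin in ps, expected margin in ps), all assumed values.
chips = [("A", 48.0, 50.0), ("B", 12.0, 50.0), ("C", -3.0, 50.0)]

for chip_id, measured, expected in chips:
    print(chip_id, classify_margin(measured, expected))
```

In this example chip B passes a conventional test but is flagged as marginal, which is the kind of outlier that could be caught before extra margin or redundancy is needed.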
With these foundational questions and practical considerations in mind, the journey toward effective SLM deployment requires a deeper exploration of the architectural choices and implementation strategies available to designers. By addressing the nuances of data handling, test infrastructure, and system integration, architects can better position their designs to fully realize the benefits that SLM offers across the silicon lifecycle.
“It really comes down to what problem the SoC architecture is trying to solve for,” Karvat said. “Is it power optimization, performance optimization, guaranteed performance, cost optimization in package and test, information to feed into higher-level software, etc.? This problem will determine what they need to consider. SLM can be simple or complex, and its usage can evolve.”
When making SLM selections, Karvat said chip architects and designers should consider SLM uses from bench characterization to mission mode, to in-field test, and how to leverage the same infrastructure at multiple stages. “Every SoC is different, so maintain flexibility by selecting components that are post-silicon-tunable and can be leveraged from design to design. Also, use open/standard hardware and software interfaces since no single vendor may have everything you need. You want to be able to mix and match components and have a software infrastructure that can be adapted to meet your needs in a heterogeneous environment. Software and firmware don’t have to be free, but they do need to be adaptable to meet your needs, including customization.”
Architects and designers also need to recognize that SLM spans the entire lifecycle: design, bring-up, optimization, field reliability, and fleet-level analytics, Arteris’ Nightingale observed. “Embedding sensors is only the start. The NoC must support QoS isolation for telemetry, security domains for sensitive monitor data, and the control paths that feed DVFS loops, error-handling frameworks, and self-healing mechanisms. Verification must extend to SLM coverage, including fault injection, timestamp synchronization, and event correlation. And the value is realized only when telemetry connects cleanly into firmware, drivers, and analytics frameworks. As the industry advances toward larger, multi-die AI accelerators, automotive safety platforms, and hyperscaler-class chiplets, SLM’s architectural footprint will continue to expand—making observability and lifecycle intelligence as fundamental as compute throughput or memory bandwidth.”
Fundamentally, SLM is a very broad concept. “For any specific implementation of SLM, it’s important to focus and understand what exactly the objectives are, which problems we are trying to solve,” Siemens EDA’s Eide added. “Are you trying to detect a problem and/or the ability to diagnose and resolve a problem? There may be many ways to address a particular requirement, from using sensors to a software-based solution, and a focused scope may increase the chance of success. Maybe of less significance to a designer, but overall very important, is that SLM is not just about the chip. It’s about the system. There are many pieces to the puzzle outside the chip itself, such as data transport, security, and analytics. Without those pieces, the sensors and monitors don’t do much good. This means we also must address questions like when a chip is inserted into a system that’s sold to an operator, who owns the data?”
Conclusion
Successful deployment of SLM technologies hinges on thoughtful architectural decisions, clear objectives, and an adaptable approach to both hardware and software integration. System architects must assess not only the technical requirements, but also the broader implications of data security, ownership, and analytics throughout the silicon lifecycle. By prioritizing flexibility, standardization, and scalability, organizations can unlock real value from SLM, transforming chips and systems into intelligent, collaborative entities that drive innovation and reliability in increasingly complex environments.