Testing AI Systems

What exactly are we trying to accomplish? So far, that’s not clear.


AI is booming. It’s coming to a device near you—maybe even inside of you. And AI will be used to design, manufacture and even ship those devices, regardless of what they are, where they are used, or how they are transported.

The big questions now are whether these systems work, for how long, and what those metrics even mean. An AI system, or an AI-enhanced system, is supposed to adapt over time. That means whatever is verified at signoff on the design side, and at various test insertion points from the lab through the fab, is expected to keep functioning properly even as those electronic systems change.

But in an AI-guided and AI-rich world of electronics, system behavior is programmed to change. There are weights applied to algorithms to reinforce certain behaviors, and those weights are supposed to shift over time as use models evolve and training data is updated.
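As a loose illustration of why that matters (the update rule, names, and numbers here are hypothetical, not any vendor's actual implementation), even a single behavior weight nudged toward recent usage will drift away from its factory default:

```python
# Hypothetical sketch: a scalar behavior weight nudged toward recent usage.
# The update rule, parameter names, and values are illustrative only.

def update_weight(weight: float, observed: float, learning_rate: float = 0.1) -> float:
    """Exponential moving average: the weight drifts toward observed behavior."""
    return (1 - learning_rate) * weight + learning_rate * observed

# A throttle-response weight shaped by stop-and-go traffic (low observed values).
w = 1.0  # factory default: full responsiveness
for sample in [0.2, 0.3, 0.2, 0.25, 0.3]:  # light-throttle usage data
    w = update_weight(w, sample)

print(round(w, 3))  # well below the factory default after only a few updates
```

The point of the sketch is that nothing is broken: every individual update is working as designed, yet the aggregate behavior at any test insertion point depends on the usage history, not just the shipped configuration.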

Owners of one brand of cars complained for years that their vehicles learned driving habits while they were stuck in traffic: light on the gas and heavy on the brake pedal. When they finally got a chance to drive their sports sedans on unclogged roads, those vehicles accelerated much more slowly than expected. Likewise, systems trained in one environment and sold into another may not behave optimally.

All of this becomes more complicated as AI systems interact with other AI systems, increasingly without human intervention. A factory reset to defaults in one system could have unexpected repercussions across other systems, which may or may not have been developed by the same vendor and which may or may not have the option for a reset. So while a system may give fair warning about replacing a module in an electronic system, regardless of whether that’s in a car, a factory, or some medical device, the actual replacement of the module could have unexpected consequences in overall system behavior.

So far, most of the focus on AI in verification, test, and manufacturing has been on building more reliable and robust products based on existing approaches to quality assurance. Test and verification methodologies are very good at this, and being able to build in domain knowledge and transfer that expertise with AI systems can help significantly on the manufacturing and reliability side. But when those same approaches are applied to AI systems, or AI-enhanced systems, things get blurry very fast.

The fundamental issue is that detailed specs are not applicable in evolving systems. Engineers are very good at aiming for a spec and hitting a bull’s-eye. They’ve done this with performance, power and area over the years, even if it required bending light and accelerating/decelerating subatomic particles, or depositing and removing single atoms.

But aiming for a distribution rather than a fixed number creates some new challenges. What is the optimal area within a distribution? Is it in the middle, or is it closer to the edge? And what happens when the initial design falls in the less optimal area of that distribution? Is it more likely to wander outside of the distribution over the expected lifetime of the system?

These are questions the semiconductor industry has never had to grapple with. A distribution is supposed to be an acceptable range of behavior. But when the behavior itself is in motion, it's like aiming at a moving target. It becomes much more difficult to develop and apply a test strategy that will guarantee an acceptable level of performance and energy consumption over a product or system's projected lifetime.
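One way to make the contrast concrete: a fixed-spec check passes or fails a measurement against signoff limits, while a distribution-based check asks where that measurement falls relative to an acceptance band that may itself move in the field. A minimal sketch, where the metric, limits, and drift scenario are illustrative assumptions rather than any industry standard:

```python
# Illustrative sketch: fixed-spec signoff check vs. distribution-based check.
# The power metric, limits, and sigma band are hypothetical examples.

def fixed_spec_pass(measured_mw: float, spec_mw: float = 500.0, tol: float = 0.05) -> bool:
    """Classic signoff: the measurement must land within a fixed tolerance band."""
    return abs(measured_mw - spec_mw) <= spec_mw * tol

def distribution_pass(measured_mw: float, mean_mw: float, sigma_mw: float, k: float = 3.0) -> bool:
    """Adaptive check: the measurement must stay within k sigma of a mean
    that is re-estimated as the system's behavior shifts in the field."""
    return abs(measured_mw - mean_mw) <= k * sigma_mw

# The same reading can pass a fixed spec yet fall outside an evolved band.
reading = 520.0
print(fixed_spec_pass(reading))                                  # within 5% of 500 mW
print(distribution_pass(reading, mean_mw=480.0, sigma_mw=10.0))  # more than 3 sigma away
```

The open question the article raises is exactly the hard part this sketch glosses over: who re-estimates the mean and sigma after deployment, and how often, when the system's behavior is the thing moving.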

This is a whole new challenge, and so far the chip industry is a long way from mastering it.
