Dealing With Unintended Behavior

Second of two parts: Best practices to make sure a chip does only what it’s supposed to do.

Functional verification was already tough enough, but having to identify behaviors that were never defined or intended opens up the search space beyond what existing tools are capable of handling.

However, while it may not be possible to eliminate unintended behaviors, a design team is not helpless. There are several steps that can be taken to reduce the likelihood of these problems getting into the design. And even if they make it into the design, there are ways in which these issues can be handled so that extreme situations are avoided. (Part one of this series explored some of the ways in which unintended behaviors can find their way into both the hardware and software and some of the ways in which they can be detected.)

That doesn’t guarantee success, of course. As Serge Leef, vice president of new ventures at Mentor Graphics, observed: “It is almost impossible to find malicious hardware that adds functionality in the design if it includes third-party IP blocks. Visual examination of the IP is not practical, and trying to find corner-case behaviors through simulation is also unlikely because you do not know what to look for.”

Dealing with IP
These days most designs contain at least one IP block, and many designs contain as many as 100 IP blocks. “The design team needs rigorous discipline surrounding the inclusion of third-party IP into a design,” says Leef. “The IP has to be proven, certified – trusted. How you achieve that is a different question. This is the easiest way for malicious hardware to slip into the design.”

In many cases, only part of an IP block is actually used. “You may only be using 80% of an IP’s capability,” says Jeff Hutton, senior director for automotive business solutions at Synopsys, “but if it is in a safety- or security-critical area, you will still have to verify all of the unused capabilities and ensure they are in a known state or tested to make sure they do not create problems. This is an engineering cost.”

At least there has been some development in securing the IP supply chain. IP fingerprinting is a technology that allows the detection of third-party and internally developed IP in SoC designs. “It can tell you what IP is in a design and if any of the files have been modified,” says Warren Savage, general manager for the IP division of Silvaco. “A growing concern in semiconductors is the insertion of malicious code into the hardware that can then be activated in a running system. Chip DNA Analysis can show that there are files that were detected in the SoC that are associated with the IP, but that were not put there by the IP owner. This is critically important in projects where contractors and subcontractors are handling the code, such as in the case of many government, military, and aeronautics projects.”
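The idea of checking whether delivered files have been modified can be illustrated with a plain hash manifest. This is only a minimal sketch and is not how any commercial fingerprinting tool works. The file names, the `fingerprint` and `audit` helpers, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib

def fingerprint(files):
    """Map each file name to the SHA-256 digest of its contents."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}

def audit(vendor_manifest, delivered_files):
    """Compare files found in the SoC tree against the vendor's digests."""
    actual = fingerprint(delivered_files)
    return {
        # Same name as a vendor file, but different contents.
        "modified": sorted(n for n in actual
                           if n in vendor_manifest and actual[n] != vendor_manifest[n]),
        # Present in the design, but never shipped by the IP owner.
        "unexpected": sorted(n for n in actual if n not in vendor_manifest),
        # Shipped by the IP owner, but absent from the design.
        "missing": sorted(n for n in vendor_manifest if n not in actual),
    }

# Hypothetical IP release as shipped by the vendor.
shipped = {
    "uart_core.v": b"module uart_core; endmodule\n",
    "uart_fifo.v": b"module uart_fifo; endmodule\n",
}
manifest = fingerprint(shipped)

# The same IP as found in the SoC: one file altered, one file added.
in_soc = {
    "uart_core.v": b"module uart_core; /* extra logic */ endmodule\n",
    "uart_fifo.v": b"module uart_fifo; endmodule\n",
    "uart_dbg.v": b"module uart_dbg; endmodule\n",
}

report = audit(manifest, in_soc)
print(report)
```

A check like this only shows that something changed after the IP left its owner. It says nothing about whether the original delivery was trustworthy.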

Many IP blocks are connected using an on-chip interconnect fabric. “For the Internet of Things (IoT), people are developing chips in a speculative manner,” says Drew Wingard, chief technology officer at Sonics. “That creates different design behavior compared to someone who is trying to design a whole system. A semiconductor company may be concerned about their liability, but it is a fast-moving market and this is not the greatest uncertainty that they have in terms of their product being attractive in the market. Right now it is the list of features and the amount of energy it consumes. Then come cost and security. If you are a large system company that has a brand that you care about, then you may be more careful.”

But this does not mean that the system has to be left unsecured. “Add target agents that sit in front of memory or peripherals or bridges, which is an on-network firewall,” suggests Wingard. “If you want to restrict a connection between a DMA engine and a USB core, set up the chip so that while the connection is there, it will generate errors if someone tries to use it. This is similar to the memory protection function that you find inside address translation or mapping function of a CPU, except it is at the target side. So it can determine on a per-master basis which masters, in which modes of operation, can access a resource.”
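A behavioral model makes the target-side check easier to picture. The sketch below is not any vendor's implementation. The `Request` and `TargetFirewall` names, the master IDs, and the mode names are hypothetical, and a real agent would be hardware sitting in the interconnect.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Request:
    master: str   # initiator ID carried with the transaction, e.g. "cpu" or "dma"
    mode: str     # operating mode of the initiator, e.g. "secure" or "normal"
    write: bool   # True for a write, False for a read

class TargetFirewall:
    """Sits in front of one target and checks every request against a table."""

    def __init__(self, target, rules):
        self.target = target
        # rules maps (master, mode) to the set of allowed operations: "r", "w".
        self.rules = rules

    def access(self, req):
        op = "w" if req.write else "r"
        allowed = self.rules.get((req.master, req.mode), set())
        # The physical connection exists, but a disallowed use returns an error.
        return "OK" if op in allowed else "ERROR"

# Hypothetical policy for a USB core: the DMA engine has no entry at all.
usb_fw = TargetFirewall("usb", {
    ("cpu", "secure"): {"r", "w"},
    ("cpu", "normal"): {"r"},
})

print(usb_fw.access(Request("cpu", "secure", write=True)))   # OK
print(usb_fw.access(Request("cpu", "normal", write=True)))   # ERROR
print(usb_fw.access(Request("dma", "normal", write=False)))  # ERROR
```

The table defaults to denying anything it does not list, so a master that was never considered by the designer cannot reach the target.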

But hardware IP is not the only IP in the system. A lot of software IP goes into systems these days. “You have to pay more attention to your suppliers,” says Simon Davidmann, chief executive officer for Imperas. “If you go to the web and download software, then you may not know how good or reliable it is. When you buy an IP block, do you know how well it has been validated? People buy IP blocks because they are hard to understand and implement. When you buy unqualified pieces you are putting your business at risk and people have to become more careful.”

There is increasing awareness about best practices to ensure safety and security in software and hardware. “We are still in the relatively early stages of understanding where the risks are,” adds Davidmann. “Methodologies and tools are evolving, which are helping people to build better products. If your product connects to the Internet, you need to have experts working with the team who know where the vulnerabilities are. Companies that do not do this pose a big risk to society.”

Such discussions often lead to a consideration of open source. “Open-source solutions have a variety of developers that intend to use the software platform for different applications,” says Majid Bemanian, director of networking & storage segment marketing for Imagination Technologies. “In this way they can validate almost all corner cases of the platform. In addition, the community exchange can easily highlight existing issues and concerns about both software and the target hardware platform.”

But there are always counterarguments to openness. “Consider the world of RTOSes,” says Davidmann. “Most of them publish source so that you can see what is going on. There are people like the FreeRTOS guys who have taken this further and have three different flavors – the generic FreeRTOS, which is open source, and then they have OpenRTOS and SafeRTOS. OpenRTOS is controlled more and you don’t have to provide source, and SafeRTOS has been rewritten to meet much more stringent standards. The source for this is not open.”

Dynamic detection
If we assume that the hardware may contain a Trojan, even after rigorous verification, what can be done to keep the system secure? “Designers can consider adding a policing agent into the platform via an embedded or standalone dedicated security module,” says Bemanian. “Such a module can monitor and manage the operation of the platform during power-up sequence and run-time. This allows for the secure agent to observe a certain level of violation and respond accordingly.”

The architecture of the system may be helpful in this case. “The assumption is that you reach through a bus to access malicious hardware,” says Leef. “There is some logic in an IP block that is connected to the bus and, possibly without anyone knowing it, some specific memory-mapped IO transactions or unique sequences of transactions directed at the IP block would trigger the Trojan, which would then release its payload. The interaction with the bus can be caused by embedded software running on the host.”

If that is indeed the mechanism, then solutions can be created. “Some companies have committed hardware resources into their chip that are used for visibility and performance analysis,” says Gajinder Panesar, CTO for UltraSoC. “These can be repurposed for bare metal security. We have things like bus monitors that are protocol aware and transaction aware. We have other blocks that are status monitors that monitor generic signals and at runtime you can ask it to look for certain conditions. When those things happen, do something. You can say look at the transactions that go to a secure area of the system. You may have your fuses or your keys or your security engine there. If these transactions don’t come from this processor or this process on this processor they do something such as raising an alert.”
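The kind of runtime rule Panesar describes can be sketched as a filter over observed bus transactions. This is an illustration only, not UltraSoC's product or API. The address range, the initiator names, and the process IDs are invented for the example.

```python
# Hypothetical address window holding fuses, keys, or a security engine.
SECURE_BASE, SECURE_END = 0x4000_0000, 0x4000_0FFF

# Hypothetical (processor, process ID) pairs allowed to touch that window.
TRUSTED = {("cpu0", 7)}

def monitor(trace):
    """Return every transaction that hits the secure window from an untrusted source."""
    alerts = []
    for txn in trace:
        in_secure = SECURE_BASE <= txn["addr"] <= SECURE_END
        if in_secure and (txn["initiator"], txn["pid"]) not in TRUSTED:
            alerts.append(txn)
    return alerts

trace = [
    {"initiator": "cpu0", "pid": 7, "addr": 0x4000_0010},  # trusted process: fine
    {"initiator": "cpu0", "pid": 3, "addr": 0x4000_0010},  # right CPU, wrong process
    {"initiator": "dma", "pid": 0, "addr": 0x4000_0FFF},   # unexpected initiator
    {"initiator": "dma", "pid": 0, "addr": 0x2000_0000},   # outside the window
]

for txn in monitor(trace):
    print("ALERT:", txn["initiator"], txn["pid"], hex(txn["addr"]))
```

In this sketch the monitor only reports. Whether an alert should block the transaction, reset the system, or simply be logged is a policy decision left to the platform.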

Better verification
While traditional verification may not be looking for a Trojan because it is not intended behavior, it does not mean that there are no ways to find it. “You can start to systematically walk through every line of code and inject a fault,” says Hutton. “Now the tests can be rerun and you ask: does your environment notice it? If someone put in bad code, then it will be seen. When an unintended piece of code was inserted and a fault injected, then if it changes behavior that is not seen by the environment, you have a verification hole, and either the design needs to be corrected or the verification improved.”
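A toy model shows how a fault campaign can expose logic that the tests never exercise. It is a sketch under heavy assumptions: the "design" is a Python model of an 8-bit adder, the fault names are made up, and real flows inject faults into RTL or gates rather than into a function.

```python
def make_design(fault=None):
    """Return an 8-bit adder model; `fault` names one injected fault, if any."""
    def add8(a, b):
        if fault == "stuck_bit7":
            return (a + b) & 0x7F            # result bit 7 stuck at 0
        if a == 0xA5 and b == 0x5A:          # unintended logic hidden in the design
            return 0x00 if fault == "trojan_branch" else 0xDE
        return (a + b) & 0xFF
    return add8

# The verification environment: (a, b, expected) checks on intended behavior.
TESTS = [(1, 2, 3), (100, 100, 200), (255, 1, 0)]

def environment_passes(design):
    return all(design(a, b) == expected for a, b, expected in TESTS)

def fault_campaign(faults):
    """Map each fault to True if the environment noticed it, False if not."""
    assert environment_passes(make_design()), "fault-free design must pass first"
    return {f: not environment_passes(make_design(f)) for f in faults}

results = fault_campaign(["stuck_bit7", "trojan_branch"])
print(results)
```

The fault in the ordinary adder logic is caught. The fault injected into the hidden branch goes unnoticed, which tells the team that this region of the design is never observed by the tests and deserves a closer look.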

“Fault simulation used to be used to check the manufacturing of a device,” says Davidmann. “Today, they are using fault injection to see how the system behaves under errors and if that creates vulnerabilities. What happens if the packets you receive have errors in them? Can your system handle it or does it open an unintended door?”

Another way is to check off what is in the final design. “We are seeing some people doing a final checkoff at the polygon level,” says Hutton. “Each design element is matched up (possibly as part of layout-versus-schematic (LVS) or design rule checking (DRC) checks). This is the equivalent of doing a dead-code check at the higher level, but you are specifically looking for code that is unintended, and may not actually be dead.”

Indeed, dead code may be well disguised. “Using formal you can look to see if there is any code in there that you can’t activate from the inputs,” says David Kelf, vice president of marketing for OneSpin Solutions. “But even that can be overcome, and it can be set up so it doesn’t look like dead code. It does things and the outputs go somewhere. It behaves normally until it is supposed to change behavior. So you have to look at the functional structure of the design and the state space.”
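Kelf's point can be seen in a small reachability check over a state machine. This is a simplified stand-in for what a formal tool does, not a description of any product. The state names and the `magic_seq` input are hypothetical.

```python
from collections import deque

def reachable(transitions, reset):
    """Breadth-first search for every state reachable from the reset state."""
    seen, queue = {reset}, deque([reset])
    while queue:
        state = queue.popleft()
        for nxt in transitions.get(state, {}).values():
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Hypothetical controller: state -> {input event: next state}.
FSM = {
    "IDLE":  {"start": "RUN"},
    "RUN":   {"done": "IDLE", "magic_seq": "LEAK"},  # disguised Trojan trigger
    "LEAK":  {"done": "IDLE"},
    "DEBUG": {"exit": "IDLE"},                       # nothing ever leads here
}

dead = set(FSM) - reachable(FSM, "IDLE")
print(sorted(dead))
```

Only `DEBUG` is reported as dead. The `LEAK` state is reachable through a rare input, so a dead-code check alone passes it, which is why the functional structure and state space also have to be examined.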

While security and safety share some characteristics in this space, the goals are different. Safety is making sure that all of the requirements have been met and tested, so even a random fault during operation doesn’t interrupt the basic operation of the chip. Security is making sure that the chip can only do what is in the specification, and nothing else.

Industry progress
At the end of the day, the industry is driven by economics. “People are willing to pay maybe a 20% penalty for this in terms of added verification cycles,” says Hutton. “But even this is too much in some markets. For automotive, they have no choice because of the safety requirement that they must prove the safety mechanism in their design work, and that errors are detected.”

And there is little monetary motivation to do more than is required. “In general, people do not feel there is a compelling reason to do something unless they have been attacked,” says Leef. “People are not willing to pay for such solutions. You have to follow the money. Is someone going to make a lot of money fixing the problem? Will someone lose a lot of money by not addressing the problem? There is no clear answer to this question. The IoT is truly the Wild West and the stakeholders have not been sorted out.”

Thankfully, the automotive industry is looking at this more seriously. “If there is a security flaw in a car who is to blame?” asks Kelf. “If it is a safety problem, the car maker is on the hook. But what if someone breaches the system or code is maliciously modified? Are the automotive companies still responsible? Their conclusion is that they are.”

But this is not mandated by any government. “ISO 26262 is not a mandatory standard and no government requires it,” says Hutton. “However, it is best practice, and from a legal standpoint it sets a precedent. If you didn’t follow it, your liabilities are going to be greater when someone gets hurt. The National Transportation Safety Board, a U.S. regulatory agency, wants to take a hands-off approach and they would prefer the industry to come up with the ideas and to just say if it is good enough. They do not want to stifle an industry that is still immature, but once it matures, they may come in with some regulations.”

Conclusion
There is no silver bullet, but it is clear that attempting to cut corners raises the risk level. Many start-up developers for IoT devices may not care, so long as they have the right feature set at the right price, and they can get a product to market ahead of the competition. But if this or other industries decide not to self-regulate, as the automotive industry is doing, regulators may be forced to step in.
