How well understood are domain crossings, especially when they involve multiple domains? They require a combination of methodology and tools.
Clock, power and reset domains can form a tangled web if systems are not architected correctly. Wires that cross these domains often require special treatment and additional analysis. The three types of domains are evolving independently, meaning that designers must keep up with the latest methodology guidelines and tool capabilities to ensure problems do not remain hidden until they are exposed in silicon.
The number of clock domains grows with design complexity and with the use of third-party IP, and clock domains often are associated with I/O circuitry that needs to operate at specific frequencies. As leakage became an increasing problem for battery-powered devices, power domains were added as well. This, in turn, significantly increased the complexity associated with reset. FinFETs were used to control the leakage, but clock and reset domains continue to increase in number and complexity. Designs that remain in planar technologies have to continue worrying about power domains, as well.
Fig 1: Multi-domain verification. Source: Mentor, a Siemens Business
Clock explosion
The number of clock domains in a modern SoC continues to grow. “What is driving the explosion of clock domains is the management of dynamic power,” says Pete Hardee, director of product management in the system and verification group at Cadence. “Ideally, you want to have all of the computation on a chip take place at its optimal clock frequency, such that all computation completes at the point in time when it needs to be consumed. No clock should be running faster than it needs to.”
Complexity can obscure issues. “The primary reason why we had multiple domains was complexity,” says Prakash Narain, president and CEO for Real Intent. “Increasing complexity creates greater fragmentation of the design, and we want to deal with things as hierarchically as we can. As a result, we have a lot more domains and we need to make sure that everything is working correctly.”
Others agree. “The more complex the chips get, the more clocks they have, and the more problems people have,” says Joe Mallett, senior product marketing manager at Synopsys. “When you didn’t have thousands of clock domains you could do the analysis later in the flow because it wasn’t that complex. As the number of domains has exploded, you must look at it earlier while you are doing the design. If you run the tools and get a list of 100,000 violations, you have to work out which ones to look at and how to start addressing them. Which ones are bug escapes and which ones are real or false? The tools have done a good job trying to get the noise down, get the verification engineers to see what is really important, and to find the bugs before they become costly. They need to start looking at this as early as possible, and this is part of shift left.”
Dealing with the issues is a combination of methodology and tools. “There is certainly a proliferation of methodologies and tools to verify mainstream crossing issues,” says Chris Giles, product marketing manager for Mentor, a Siemens Business. “However, as power, performance and area constraints continue to push designs, the domain crossings become less isolated and more integrated into full functionality within datapaths and control logic. These crossings become parallelized and distributed throughout many blocks of logic. This makes the crossings more complex for tools to find and verify, and for reviews to fully cover.”
Modality can make things worse. “Large SoC designs are becoming more modal as teams move to integrate more functional and product needs with one set of reticles and silicon,” adds Giles. “If each block in a design can function modally, the combinations of operations can exponentially increase the number of domain crossings that need to be analyzed.”
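The growth Giles describes can be made concrete with a little arithmetic. The sketch below assumes a hypothetical design in which every block has the same number of independent operating modes. The function name and the figures are illustrative only and are not drawn from any real SoC.

```python
def mode_combinations(num_blocks: int, modes_per_block: int) -> int:
    """Count chip-level mode combinations, assuming every block can be in
    any one of its modes independently of the others (a simplification)."""
    return modes_per_block ** num_blocks


# Hypothetical figures: three modes per block, growing block counts.
for blocks in (4, 8, 16):
    print(f"{blocks} blocks -> {mode_combinations(blocks, 3)} combinations")
```

In practice many combinations are illegal or equivalent, so the real analysis space is smaller, but the exponent is why mode-aware analysis gets expensive quickly.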
For those who fully understand the issues, it may seem like a lot of worry over nothing. “There aren’t any new clock, reset or power domain crossing challenges which have emerged over recent years,” says Peter Greenhalgh, vice president of technology and a fellow at Arm. “It’s largely a solved problem with known techniques. Clock, reset, and power domain crossing challenges don’t meaningfully change between the mobile, automotive or server markets. The circuits need to be functionally correct, and that doesn’t change between markets. Also, what is needed to deal with these requirements doesn’t significantly change between CPU, GPU, ML or other IP. About the only difference is high-bandwidth IP and/or high-frequency IP that needs more care to maintain performance.”
Awareness of the domain issues extends beyond SoC design. “Clock domain crossing issues can also cause functional instability within FPGA designs,” says Louie de Luna, director of marketing at Aldec. “It is extremely hard to understand the cause of such issues. It may appear that the design is working perfectly in functional simulations, but when the design is brought up in the lab, the issues start to cause failures.”
Power domains
Power domains remain a problem for many designs. “The transition from 28nm planar CMOS technology to 16nm finFET technology came with a big change as far as power is concerned,” says Cadence’s Hardee. “Leakage used to be a huge problem because you could not fully turn off the transistor and so there was always some current flowing. That is what drove power gating. UPF style gating was an absolute necessity in those chips, and power gating is about controlling leakage power. But in the finFET technology nodes, the solution concentrates around clock domains and clock gating to make sure that dynamic power is minimized. There is some overlap between the issues, and there are times when they interact.”
In addition, the issue does not extend out from the mobile phone industry as much as many other issues do. “The use cases tend to be a little different in other application areas such as automotive or AI,” says Synopsys’ Mallett. “Consider automotive, where the whole time the car is running, you will not turn off the driver assistance system. If there is any overlap between the functionality and safety, it will likely remain functional the whole time. So you will not be turning it on and off. Separate power domains do not make much sense because you are not able to save the power.”
Power and reset
When power domains came into existence, they necessitated a new approach to resets. “Retention registers are of interest when there is power gating,” says Hardee. “You need to have retention registers to store state when power is removed, such that power on reset can happen within the required latency. In the early days of power gating, the decision about the usage of retention registers tended to be a yes or no decision on a per-block basis. Today, they want to optimize what really needs retention registers. There is a cost associated with a retention register as compared to a regular register, so they are asking if they need all of the registers to have retention or can they have just a few important ones be retained? You still need to remove all of the unknowns within the required number of clock cycles.”
Handling reset is a lot more difficult than it first appears. “Reset was not a well thought out activity in the design process,” says Real Intent’s Narain. “Power-on reset just reset everything, but now you have power managed designs and you can turn off a certain section of the design while the rest of the design is functioning correctly. Then you have to reset that section of the design while the rest of the design is operational. You have to pay attention to both the resetting as well as coming out of reset issues.”
Resets can get very complicated. “The challenge with reset domains is that there can be different reset domains within a single block or there can be reset domains that span multiple blocks,” says Mallett. “When reset spans two different blocks, you might have a memory DMA that has come out of reset and it is feeding data into another block that has not yet come out of reset. If you do not check for that, and make sure they are synchronized between the two, you could have problems.”
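The DMA scenario Mallett describes amounts to an ordering check between producer and consumer blocks. The following is a minimal sketch of that idea, assuming reset release times are known as cycle counts. The block names, release cycles and the helper `find_reset_order_hazards` are hypothetical, and a production reset domain crossing tool does far more than this.

```python
from typing import Dict, List, Tuple


def find_reset_order_hazards(
    release_cycle: Dict[str, int],
    links: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (producer, consumer) links where the producer leaves reset
    before the consumer, so data can reach a block that is still in reset."""
    return [
        (src, dst)
        for src, dst in links
        if release_cycle[src] < release_cycle[dst]
    ]


# Hypothetical design: cycle at which each block's reset is released.
release_cycle = {"dma": 10, "codec": 25, "cpu": 5}
# Data flows from the first block to the second in each pair.
links = [("dma", "codec"), ("codec", "cpu")]

hazards = find_reset_order_hazards(release_cycle, links)
print(hazards)
```

Here the DMA is released 15 cycles before the codec it feeds, so that link is flagged, while the codec-to-CPU link is not because the CPU is already out of reset.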
This power-on reset problem put more focus on the general reset problem, such that designs without power domains started to look at optimizing reset. “There is also a cost associated with resettable registers versus non-resettable registers,” adds Hardee. “Designers are trying to optimize the entire reset sequence to minimize the register cost and maintain the required latency. If you are still using power gating, you have an extra level of complexity in that you have to optimize the resettable register and decide which of those registers need to be retention registers.”
Part of the problem is that the timing of the design is altered when power domain circuitry is added. “You have to make sure that sequential equivalence checking is power-aware,” adds Hardee. “You are trying to ensure that the design, either with or without a full complement of retention registers, when compared to one that has been optimized, has the same functional behavior.”
Thankfully, power, clock and reset do not all come together at the same time. “The cross connections between clock and reset are actually fairly minimal,” says Narain. “They are a problem, but not the dominant population of what needs to be signed off. RDC requires containment of the solution within a synchronous domain. CDC is asynchronous clock domains. So the design methodology is about isolating these.”
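The separation Narain describes can be pictured as a simple classification of register-to-register paths. This sketch assumes each register is tagged with a clock and a reset, and that the set of asynchronous clock pairs is supplied by the designer. The `Flop` type, the labels and the example names are illustrative assumptions, not any vendor's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flop:
    name: str
    clock: str
    reset: str


def classify_crossing(src: Flop, dst: Flop, async_pairs: set) -> str:
    """Label a path as CDC, RDC or none.

    CDC: source and destination clocks are declared asynchronous.
    RDC: clocks are synchronous but the reset signals differ.
    """
    if frozenset((src.clock, dst.clock)) in async_pairs:
        return "CDC"
    if src.reset != dst.reset:
        return "RDC"
    return "none"


# Hypothetical clocks and resets.
async_pairs = {frozenset(("clk_io", "clk_core"))}
rx = Flop("rx_data", clock="clk_io", reset="rst_io")
core_a = Flop("core_a", clock="clk_core", reset="rst_core")
core_b = Flop("core_b", clock="clk_core", reset="rst_dbg")

print(classify_crossing(rx, core_a, async_pairs))
print(classify_crossing(core_a, core_b, async_pairs))
```

A path that crosses both an asynchronous clock boundary and a reset boundary is labeled CDC in this sketch, which is a simplification of how such overlaps are handled in sign-off.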
Domain crossings in automotive
Chips for automotive must consider functional safety. “The quality requirement is significantly higher because the failures can be more catastrophic,” says Narain. “The fact that you are building in redundancy using voting and polling just appears as additional logic to the tools. Because of the correlation and the time synchronicity of correlation that is required between these modules, one has to pay attention to the domain crossing principles, especially metastability, which introduces uncertain delays. So the requirement will be for redundant modules to be within a synchronous clock domain. You do not want asynchronous data transfer between these modules since they could cause temporal mismatch and loss of correlation between these modules that are supposed to be producing identical results.”
Those who do make that stipulation must do some additional analysis. “The redundancy mechanisms should consider domain crossings in order to distinguish between a functional failure requiring redundancy-based resolution, versus functional mis-compares resulting from domain crossings,” says Mentor’s Giles. “Mis-attributing the cause ought to be avoided here. Much of this can be solved in architecture, but as power, performance and area constraints push designs, the domain crossings could be pushed deeper into functionality, triggering implementation-caused domain crossing issues. It will be important to consider complete post-implementation domain verification to prevent domain crossing errors from compromising the safety of a system.”
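A toy model shows how a domain crossing can produce a mis-compare between redundant modules even when neither has failed. It assumes each copy receives the same asynchronous input through its own synchronizer, and that a metastable event can add one cycle of latency to one copy. The sample stream and function names are hypothetical.

```python
from typing import List


def through_synchronizer(samples: List[int], latency: int) -> List[int]:
    """Delay a sample stream by `latency` cycles, padding with 0 (the
    assumed reset value). The latency stands in for synchronizer settling."""
    return [0] * latency + samples[: len(samples) - latency]


def count_miscompares(a: List[int], b: List[int]) -> int:
    """Count cycles in which two lock-step copies disagree."""
    return sum(1 for x, y in zip(a, b) if x != y)


async_input = [0, 1, 1, 0, 1, 0, 0, 1]

copy_a = through_synchronizer(async_input, latency=2)
copy_b_same = through_synchronizer(async_input, latency=2)
copy_b_late = through_synchronizer(async_input, latency=3)  # one extra cycle

print(count_miscompares(copy_a, copy_b_same))
print(count_miscompares(copy_a, copy_b_late))
```

Both copies compute the same values, yet the one-cycle skew makes a lock-step comparator report disagreements. Synchronizing the input once and distributing it to both copies within one synchronous domain avoids this.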
Increasingly, more of the functionality of the car needs this level of analysis. “What used to be a ‘don’t care’ might have to be cared about today,” says Mallett. “A simple example is navigation systems, which used to be a don’t care for functional safety. But today, because the navigation system and driver assistance systems are now connected, both systems have to worry about functional safety. Now you have to go through every warning and make sure you have an answer for it. You have to go through every line in the report and make sure you have an explanation or an understanding if it can cause a system problem and a functional safety problem on top of that.”
Domain crossings in artificial intelligence
Most AI chips are being fabricated in the latest technology nodes, so power domains are not a big issue. “When a circuit is not needed, rather than seeing power gating we are seeing clock gating,” says Hardee. “So they will gate the clock whenever circuits are inactive. They are not looking at leaf-level clock gating that may be inserted by synthesis tools. They are making coarse-grained clocking decisions and implementing that in the RTL. When you play with clock gating in RTL, you need sequential equivalence checking rather than a logical equivalence checker. You are changing the clock schedule for those blocks — 95% of the usage of sequential equivalence checking is to verify RTL optimization of clock gating.”
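The idea behind checking an RTL clock-gating change can be sketched with a single register. One model keeps the clock running and recirculates the value when the enable is low. The other withholds the clock edge instead. The code below compares them over every input sequence up to a small depth. It is a bounded, brute-force illustration under assumed one-bit signals, not a substitute for a formal sequential equivalence checker.

```python
from itertools import product


def run_enable_mux(stimulus, q0=0):
    """Free-running clock: the register reloads its own value when en is 0."""
    q, trace = q0, []
    for en, d in stimulus:
        q = d if en else q
        trace.append(q)
    return trace


def run_clock_gated(stimulus, q0=0):
    """Gated clock: the register only sees a clock edge when en is 1."""
    q, trace = q0, []
    for en, d in stimulus:
        if en:  # clock edge reaches the register
            q = d
        trace.append(q)
    return trace


def bounded_equivalent(depth: int) -> bool:
    """Compare both models on all (en, d) sequences of the given length
    and both initial register values."""
    inputs = list(product((0, 1), repeat=2))
    for stimulus in product(inputs, repeat=depth):
        for q0 in (0, 1):
            if run_enable_mux(stimulus, q0) != run_clock_gated(stimulus, q0):
                return False
    return True


print(bounded_equivalent(4))
```

The clock schedule differs between the two models, but the observable register value does not, which is the property a sequential equivalence check is meant to establish.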
It still needs to be architected carefully to avoid unnecessary activity. “At a high level, you can architect the system so that it is a scalable AI engine,” says Mallett. “You might define a network that runs in ‘this’ portion of the array and a bigger network that requires two portions of the same thing. Then you can remove power to the pieces that are not being used. You could do this using power domains, or you could just hold the unused segments in reset.”
Conclusion
Domain crossing is continuously evolving. “The design methodologies are evolving, the applications are evolving, and this is putting continuous pressure on the tools,” says Narain. “This creates opportunities for the tools to evolve and enable a left shift for all of the domain crossing applications.”
Designers have to keep abreast of the latest advances. “Domain crossings, and their impact to silicon, can be either underestimated or not understood,” says Giles. “While the means to verify the crossings almost always exist, the methodologies and checks are often not adopted due to complexity, cost, or even confidence. If we’re still seeing issues in silicon despite a robust and prolific verification methodology for crossings, the issue may to some extent be in adoption.”