Power Modeling Standard Released

IEEE 2416 is the newly released standard for power modeling. What can we expect from it?

Power is becoming a more important aspect of semiconductor design, but without an industry standard for power models, adoption is likely to be slow and fragmented. That is why Si2 and the IEEE decided to do something about it.

Back in 2014, the IEEE expanded its interest in power standards with the creation of two new groups: IEEE P2415 – Standard for Unified Hardware Abstraction and Layer for Energy Proportional Electronic Systems, and IEEE P2416 – Standard for Power Modeling to Enable System-Level Analysis. That was in addition to IEEE 1801-2018 – Standard for Design and Verification of Low-Power, Energy-Aware Electronic Systems, more commonly known as Unified Power Format (UPF) 3.1. The goal was for IEEE 2416 to become the Unified Power Model (UPM) that fed information into UPF. That work has now concluded, and IEEE 2416-2019 was released on July 31, 2019.

Semiconductor Engineering spoke individually with Jerry Frenkil, Si2 director of Open Standards, and David Ratchkov, founder of Thrace Systems, to discuss the new standard.

SE: Why is an industry power model so important?

Frenkil: Fundamentally, UPM is a system-level power modeling standard. It has multiple data representations and these models can be used at a variety of levels, but the main focus was the system level. We now have a standard format for both internal and external models. This should save resources within the industry.

Many years ago, we looked at low-power design to determine what was and wasn’t working. What was identified was that there was no good solution for power models. The industry has gotten along fairly well for a long time with gate-level power models and the Liberty format, but beyond the gate level, it doesn’t work very well.

Today, if you are building an SoC, you almost certainly are using IP from multiple sources. There is no common format for the power data. The large companies, in particular, often have an internal format, and they will translate external formats into their internal format. This is a painful process. One member commented that the value of the standard is that it will enable them to avoid an obsolescence trap of relying on an internal proprietary format.

SoC designers may have over 100 libraries, and each may be at different process/voltage/temperature (PVT) points. If they find that they need a different PVT point, they have to go back to the model providers, who will consider that an extra-charge item with a delivery latency. So, model consumers just estimate most of the time.

The goal was to develop standardized interoperable power models—interoperable, meaning they can operate at different levels of abstraction and with different tools. But the focus was really at the system level, because that is where we saw a big hole.

Ratchkov: 2416 is going to be very important because it solves some major problems that users see. One of the big ones is being able to carry power data, whether equations, tables or actual characterization numbers, through the entire design process, from architecture, through timing, and on to tapeout. The continuing large-scale integration of components onto a single die drives the need to understand power much earlier than before.

SE: You mentioned Liberty, a standard originally developed by Synopsys and now an open standard, which captures information about cell libraries. This includes timing, power, area, connectivity and environmental operating information, etc. Could that have been extended?

Frenkil: We came up with a number of advancements to modeling and we talked to the Liberty Technical Advisory Board, and they did include some of our enhancements in one version. But they shied away from the system level, telling us they didn’t see that as an application for Liberty.

SE: What aspects of 2416 are new?

Frenkil: Multi-level modeling is new. We can have one set of data that can be cast into a bit- or gate-level type model, or into a system-level model. So there is a single set of data with multiple interfaces. Whether you are using a gate-level tool, a system-level tool, or an 1801-compatible interface, the same underlying data is used. This ensures consistency.
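To make the "one data set, multiple interfaces" idea concrete, here is a minimal Python sketch with invented names and values, not anything from the standard itself: a single power-data object is exposed through a gate-level view and a system-level view, so both kinds of tools compute from the same numbers.

```python
# Hypothetical sketch (not IEEE 2416 syntax): one underlying data set exposed
# through two views, so gate-level and system-level tools read consistent numbers.

class PowerData:
    """Single source of power data for a block (illustrative values only)."""
    def __init__(self, leakage_mw, energy_per_op_pj, ops_per_cycle):
        self.leakage_mw = leakage_mw
        self.energy_per_op_pj = energy_per_op_pj
        self.ops_per_cycle = ops_per_cycle

class GateLevelView:
    """The kind of query a gate-level power tool might make."""
    def __init__(self, data):
        self.data = data
    def power_mw(self, toggle_rate_hz):
        dynamic_mw = self.data.energy_per_op_pj * 1e-12 * toggle_rate_hz * 1e3
        return self.data.leakage_mw + dynamic_mw

class SystemLevelView:
    """The kind of query a system-level (e.g., 1801-driven) tool might make."""
    def __init__(self, data):
        self.data = data
    def power_mw(self, clock_hz, utilization):
        ops_per_s = clock_hz * self.data.ops_per_cycle * utilization
        dynamic_mw = self.data.energy_per_op_pj * 1e-12 * ops_per_s * 1e3
        return self.data.leakage_mw + dynamic_mw

shared = PowerData(leakage_mw=0.8, energy_per_op_pj=2.5, ops_per_cycle=1.0)
print(GateLevelView(shared).power_mw(toggle_rate_hz=200e6))             # gate-level interface
print(SystemLevelView(shared).power_mw(clock_hz=1e9, utilization=0.2))  # system-level interface
```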

The voltage- and temperature-independent (VT-independent) technology was contributed by IBM and GlobalFoundries, and we enhanced it within Si2. It allows us to build a power model that does not have power data in it; instead, it has power proxies. That enables a user, an SoC designer wanting to get power data for their design, to specify the voltage and temperature conditions at simulation run time. This has benefits for both the model provider and the SoC designer.

It is actually process-independent, too, but there is a limit to that. To get process independence, we do have to do some pre-characterization, but it is a small fraction of what one would do for a Liberty library. Once we have done that, the user can specify any voltage or temperature within a normal range, and can also make minor changes to the process definition. If you wanted to evaluate what happens if you tighten up your distribution on gate oxide or on transistor thresholds, you can do that at simulation run time.
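As a rough illustration of the proxy idea, here is a sketch only, using textbook formulas and made-up parameters rather than the standard's actual method: the model carries proxies such as an effective switched capacitance and representative transistor dimensions, and power is evaluated only when the user supplies voltage and temperature.

```python
# Minimal sketch, not the standard's math: the model carries proxy parameters
# (switched capacitance, representative transistor width and threshold) rather
# than pre-baked power numbers, and the user supplies V and T at run time.
import math

K_BOLTZMANN_EV = 8.617e-5  # Boltzmann constant in eV/K

def dynamic_power_w(c_eff_f, vdd_v, freq_hz, activity):
    """Classic alpha * C * V^2 * f estimate from a switched-capacitance proxy."""
    return activity * c_eff_f * vdd_v ** 2 * freq_hz

def subthreshold_leakage_w(i0_a_per_um, width_um, vth_v, vdd_v, temp_k, n=1.5):
    """Toy subthreshold-leakage proxy for one representative transistor."""
    v_thermal = K_BOLTZMANN_EV * temp_k  # thermal voltage, ~26 mV at room temperature
    i_leak_a = i0_a_per_um * width_um * math.exp(-vth_v / (n * v_thermal))
    return i_leak_a * vdd_v

# The SoC designer chooses voltage and temperature at simulation run time:
print(dynamic_power_w(c_eff_f=5e-12, vdd_v=0.8, freq_hz=1e9, activity=0.15))
print(subthreshold_leakage_w(i0_a_per_um=1e-6, width_um=10.0,
                             vth_v=0.35, vdd_v=0.8, temp_k=358.0))
```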

Ratchkov: When an IP provider supplies power data, they may provide it at fixed points where they have actually measured it. The burden is then on the user to figure out what the power numbers could be at other voltage and temperature points. The characterization effort is not as big as it sounds. There are methods, and IBM and GlobalFoundries have donated patents covering how this can be done. Some characterization effort is necessary, but it is a one-shot deal for components such as memories, and from there on you just use that data.

If you look at the flow today for standard cells, you have to characterize them at different process points, then carry that into the IP, generate power data at those same process points, and provide that data without silicon validation. The characterization in 2416 requires you to characterize everything at one single process point. The patented methods then allow you to move from that one voltage and temperature point to any other point that you desire.

Companies are wising up to the fact that they have to look at actual use cases. To do that, they need to model the scenarios. So we are no longer talking about estimating power for a gate. We are talking about actual use cases of large blocks. How do we model those so that we can understand what the power consumption is going to be as early as possible?

Inside the Standard
SE: The standard talks about power proxies. What are they?

Frenkil: We have the concept of contributors, which are proxies for power. For dynamic power or energy, the contributor is basically a switched capacitor. For leakage power, it is a transistor.

Fig 1: Power contributors in a NAND gate. Source: Si2

What we do is look at a block of logic; in Figure 1 we show a NAND gate. In each of the NAND gate's states, different transistors are leaking. For a NAND gate, we don’t see a lot of reduction by only looking at the leaking transistors, but for a larger piece of logic it can become substantial. We only have to represent those transistors that leak. On the bottom left we have a stack of 2 N-channel devices. In the 0-0 state, those 2 transistors are leaking sub-threshold current, while the 2 P-channel devices are leaking through gate leakage. So we don’t have to model the whole NAND gate, just the leaking transistors. We get about 100:1 compression on simple primitives, and as the logic gets larger and more complex, the compression is even higher.

In the library, we have these transistor contributors. We list which transistors are leaking and what their width and length are, and then we calculate power using our built-in transistor contributor values. We take the BSIM SPICE model, strip away what we don’t need, and run it in closed-form fashion, which enables us to avoid the Newton-Raphson iterations (Figure 2). So we have no convergence issues. It runs very fast, and we are not doing it on all of the transistors, only on a few representative ones. This is how we get PVT independence.

Fig 2. Modified SPICE simulation flow for proxy power. Source: Si2
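The following is an illustrative-only Python sketch of the contributor concept for the NAND-gate example above; the leakage values and device widths are placeholders, not characterized data.

```python
# Illustrative-only sketch of the contributor idea for the NAND-gate example:
# each input state lists only the transistors that leak, with their widths,
# and leakage is summed from per-transistor proxy values. All numbers are
# placeholders, not characterized data.

# Hypothetical per-unit-width leakage currents (A/um):
SUBTHRESHOLD_A_PER_UM = {"nmos": 0.10e-9, "pmos": 0.06e-9}
GATE_LEAK_A_PER_UM = {"nmos": 0.02e-9, "pmos": 0.03e-9}

# State "00": the stacked N-channel devices leak sub-threshold current,
# while the P-channel devices leak through the gate.
nand_state_00_contributors = [
    {"type": "nmos", "w_um": 0.4, "mechanism": "subthreshold"},
    {"type": "nmos", "w_um": 0.4, "mechanism": "subthreshold"},
    {"type": "pmos", "w_um": 0.6, "mechanism": "gate"},
    {"type": "pmos", "w_um": 0.6, "mechanism": "gate"},
]

def leakage_power_w(contributors, vdd_v):
    total_current_a = 0.0
    for c in contributors:
        per_um = (SUBTHRESHOLD_A_PER_UM if c["mechanism"] == "subthreshold"
                  else GATE_LEAK_A_PER_UM)[c["type"]]
        total_current_a += per_um * c["w_um"]
    return total_current_a * vdd_v

print(leakage_power_w(nand_state_00_contributors, vdd_v=0.8))
```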

Consider that you have a network interface block and you want to build a power model. You would have to simulate, nearly exhaustively, everything it can do, capture the power data, and then format that to look like a gate-level structure in Liberty. Often that is not possible. You get state explosion; there are too many different events. For all practical purposes it is close to impossible. This is why you don’t see good power models for anything larger than a scan flip-flop.

In addition to these contributors, you can use tables, scalars or expressions. This is all done in a standardized, formalized manner. It removes the need to build ad hoc models and enables the use of standard modeling practices. As this gets rolled out and adopted, we will start to see people modeling things in a standard way and moving away from the ad hoc usage that is common today.
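For the table case, here is a minimal sketch of what a consumer tool might do with tabulated data, using hypothetical characterized points and simple linear interpolation:

```python
# Hedged sketch (not 2416 syntax): a model state could also carry a simple
# lookup table, here power versus supply voltage, interpolated at evaluation
# time. The characterized points are hypothetical.

POWER_TABLE_MW = [(0.70, 1.9), (0.80, 2.6), (0.90, 3.5), (1.00, 4.6)]  # (vdd_v, power_mw)

def table_power_mw(vdd_v):
    """Piecewise-linear interpolation over the voltage axis, clamped at the ends."""
    pts = sorted(POWER_TABLE_MW)
    if vdd_v <= pts[0][0]:
        return pts[0][1]
    if vdd_v >= pts[-1][0]:
        return pts[-1][1]
    for (v0, p0), (v1, p1) in zip(pts, pts[1:]):
        if v0 <= vdd_v <= v1:
            frac = (vdd_v - v0) / (v1 - v0)
            return p0 + frac * (p1 - p0)

print(table_power_mw(0.85))  # interpolates between the 0.80 V and 0.90 V entries
```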

SE: You mentioned power through expressions.

Frenkil: Right. In Figure 3, an expression represents the dynamic power in an active state. The expression is parameterized by performance counters (BP_mispredicts, Cache_misses, Loads, Stores). This is a reduced version made to fit on a slide; in the real model, there were about 20 of these. This is exactly what Arm had done. They chose a set of performance parameters, monitored those performance counters, used regression fitting to correlate power with the parameters, and built an expression from that. Then they ran that model on a number of workloads and showed good correlation.

Fig 3. Power expressions. Source: Si2
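The regression-fitting step Frenkil describes can be sketched in a few lines of Python; the counter activity and power numbers below are fabricated purely to show the mechanics, and the resulting coefficients have no physical meaning.

```python
# Sketch of the regression-fitting step described above. The counter activity
# and power numbers are fabricated purely to show the mechanics; they are not
# measurements from any real core.
import numpy as np

# Columns: BP_mispredicts, Cache_misses, Loads, Stores (counts per interval)
counters = np.array([
    [1.2e5, 3.0e4, 8.0e6, 2.0e6],
    [0.4e5, 1.0e4, 5.0e6, 1.5e6],
    [2.0e5, 6.0e4, 9.5e6, 3.0e6],
    [0.8e5, 2.0e4, 6.5e6, 1.8e6],
    [1.6e5, 4.5e4, 8.8e6, 2.6e6],
    [0.6e5, 1.5e4, 5.8e6, 1.6e6],
])
measured_power_mw = np.array([310.0, 205.0, 395.0, 250.0, 350.0, 228.0])

# Least-squares fit: power ~ c0 + c1*BP + c2*Miss + c3*Loads + c4*Stores
design = np.hstack([np.ones((counters.shape[0], 1)), counters])
coeffs, *_ = np.linalg.lstsq(design, measured_power_mw, rcond=None)

def power_expression_mw(bp_mispredicts, cache_misses, loads, stores):
    """Evaluate the fitted expression on new counter values."""
    return float(coeffs @ np.array([1.0, bp_mispredicts, cache_misses, loads, stores]))

print(power_expression_mw(1.0e5, 2.5e4, 7.0e6, 2.0e6))
```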

Tool Flows
SE: Getting a standard such as this into common usage involves a lot of moving parts and players. What is working today?

Frenkil: On the left (see Fig. 4, below) is what we have running today. The right is where we are heading. The flow and the data transfer are fairly conventional. You run a simulation, collect VCD, convert that to SAIF, and then the activity data is read by the power tool. It reads the models and also a set of conditions: frequency, voltage and temperature. These can be instance-specific. It then post-processes the data. On the right is the UPF flow. The 1801 vision is that you have a simulator and, as it is running, when the logic enters a different state it triggers a call to the power library. This is what they call a power expression, and it is basically an API call. It sends the parameters to some object, those parameters are used to compute the power, and static and dynamic power are returned to the simulation tool, which then does whatever it wants with them.

Fig 4. Incorporating UPM into a flow. Source: Si2
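A rough sketch of that call-style interface, with invented class and method names rather than the actual API: on a state change, the simulator queries a power-library object with run-time parameters and receives static and dynamic power back.

```python
# Rough sketch of the call-style interface described for the 1801 flow. The
# class and method names are invented for illustration; they are not the
# standard's API.
from dataclasses import dataclass

@dataclass
class PowerResult:
    static_mw: float
    dynamic_mw: float

class PowerLibrary:
    """Stand-in for a UPM-backed power model of one block."""
    def __init__(self, leakage_mw_by_state, energy_pj_by_state):
        self.leakage = leakage_mw_by_state
        self.energy = energy_pj_by_state

    def query(self, state, vdd_v, temp_c, freq_hz, activity):
        # Toy scaling of leakage with voltage and temperature.
        static = self.leakage[state] * (vdd_v / 0.8) * (1.0 + 0.01 * (temp_c - 25.0))
        # Dynamic power from a per-state energy proxy, converted to mW.
        dynamic = self.energy[state] * 1e-12 * freq_hz * activity * 1e3
        return PowerResult(static_mw=static, dynamic_mw=dynamic)

lib = PowerLibrary(
    leakage_mw_by_state={"ACTIVE": 0.9, "SLEEP": 0.05},
    energy_pj_by_state={"ACTIVE": 3.2, "SLEEP": 0.0},
)

# Simulator side: entering a new state triggers the call.
result = lib.query(state="ACTIVE", vdd_v=0.8, temp_c=85.0, freq_hz=800e6, activity=0.2)
print(result.static_mw, result.dynamic_mw)
```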

Ratchkov: When you define the architecture, you start with UPF. You have some need to do architecture planning, and you want data about the use cases and the power. Then you go and figure out the actual UPF. As you move through the flow, you develop the blocks and you get better ideas about what the power is. You update your UPF models, and then you can rerun the simulation and get better, more accurate power numbers.

The benefit of UPM is that you can create it early on. You can carry it through the flow and keep refining it with better data. The UPF description and the UPM description have to be in sync. If the UPF model for a block defines 2 states, then the UPM model has to define those 2 states. If the person creating the model wants a fine-grained model with, say, 25 states, then they must have an expression that describes each of those states.
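A small, hypothetical consistency check in the spirit of that requirement, with made-up block and state names:

```python
# Hypothetical consistency check: the states a block declares in UPF should
# each have corresponding power data in the UPM model. Block and state names
# are invented.

upf_states = {"cpu_core": {"RUN", "RETENTION", "OFF"}}
upm_states = {"cpu_core": {"RUN", "RETENTION"}}  # OFF is missing, so out of sync

def check_state_sync(upf, upm):
    problems = []
    for block, states in upf.items():
        missing = states - upm.get(block, set())
        if missing:
            problems.append(f"{block}: UPM has no power data for states {sorted(missing)}")
    return problems

for issue in check_state_sync(upf_states, upm_states):
    print(issue)
```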

SE: You need IP providers developing models, SoC developers using models, and tools to make it all work. Who starts first?

Frenkil: It is like the middle-school dance. All the kids want to be out there dancing, but nobody wants to be first. Nothing happens until some brave couple starts to dance, and then the rest follow. It will take some end user asking for these benefits. I believe we will see a small number of major design organizations begin to pick up on it, and they will begin to pressure the IP developers and the tool developers to support it. This already has occurred. One group was so taken with the idea that they called their IP provider and said, ‘We have to have this.’ Within Si2, we will work with those companies that want to work with us to help put it all together. PowerCalc (a prototype tool developed within Si2) will serve a central role in that. The early adopters will be willing to work with a new tool and a new standard and will get it working within their organizations. They will be able to demonstrate the benefits internally, and then they will apply pressure to their providers, be they IP or tool suppliers, internal or external. After this happens a few times, commercial EDA and IP providers will take note and begin to support it.

SE: Thrace is a startup that wants to be an early provider of tools that will enable the industry to start getting the benefits of IEEE 2416.

Ratchkov: I started Thrace Systems last year (2018). The goal was to look at power from early architecture to post-silicon, and from die-level to system-level. We want to be able to model power and provide insights into power for different use cases.

We have a prototype, which we will be benchmarking with some customers. I hope that with the right models and tools they will push the limits of design a little more. Being able to understand what your device does provides confidence about whether you can add extra features and whether the architecture will work as intended. That will allow users to push the boundaries, and hopefully lower power.


