Squeezing The Margins

Monitoring systems and quickly adjusting clock frequencies improve performance for specific applications and operating environments.


Back in 2016, we looked at the MediaTek Helio X20, the first Tri-Gear mobile SoC. Tri-Gear goes a step beyond ARM’s big.LITTLE concept, which pairs two core types that have different power and performance characteristics, by adding a third core type. The main advantage of this approach is a wider choice of cores, so that each workload can run at a better energy-efficiency and performance operating point.

At this year’s 70th ISSCC, held in San Francisco, CA, MediaTek presented a paper entitled “A 5G Mobile Gaming-Centric SoC with High-Performance Thermal Management in 4nm FinFET”[1], which uses a Tri-Gear ARMv9 implementation. The design comprises four Cortex-A510 high-efficiency cores, three Cortex-A710 balanced-performance cores, and one Cortex-X2 high-performance core, for an eight-core (“octa-core”) implementation.

Fig. 1

Figure 1 shows the power versus performance curves for the three different types of cores. The design also includes an ARM Mali-G710 3D graphics unit. As the title mentions, the SoC is a mobile gaming-centric part, and the performance of the system is limited by thermal constraints. The system could run at a higher voltage and frequency, but the mobile environment limits the cooling options, so at times it is necessary to back off from the highest-performance operating point. The less accurate the thermal management system is, the more margin must be added to make sure that the SoC stays within its thermal constraints. This margin shows up as running the system at lower clock frequencies (and possibly voltages) to make sure that the system doesn’t exceed its maximum signoff temperature, Tmax.

Fig. 2

Figure 2 shows how a threshold set for the worst-case power scenario engages clock throttling sooner than a more intelligent threshold scheme that can still keep the system temperature below the Tmax(Sign-off) temperature. The more accurately the thermal response can be predicted, the more performance can be squeezed out of the system by setting the threshold higher and allowing the system to run more often at a faster clock frequency.
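To make the margin argument concrete, the sketch below uses a first-order (single RC) thermal model. This model is an assumption chosen for illustration and is not the model from the paper. All names and numbers (thermal resistance, time constant, response latency, power levels) are hypothetical. The function computes the highest temperature at which throttling can begin and still keep the die at or below Tmax by the time the throttle takes effect, first under a worst-case power assumption and then under a lower predicted power.

```python
import math


def temp_after_c(t_start_c, t_amb_c, power_w, r_th_c_per_w, tau_s, dt_s):
    """Temperature after dt_s seconds at constant power (first-order RC model)."""
    t_steady_c = t_amb_c + power_w * r_th_c_per_w
    return t_steady_c + (t_start_c - t_steady_c) * math.exp(-dt_s / tau_s)


def throttle_threshold_c(t_max_c, t_amb_c, power_w, r_th_c_per_w, tau_s, latency_s):
    """Highest trigger temperature that still ends at or below t_max_c.

    latency_s is the assumed delay between crossing the threshold and the
    throttle actually reducing power.
    """
    t_steady_c = t_amb_c + power_w * r_th_c_per_w
    if t_steady_c <= t_max_c:
        # This power level can never exceed Tmax, so no margin is needed.
        return t_max_c
    # Solve temp_after_c(threshold, ..., latency_s) == t_max_c for threshold.
    return t_steady_c + (t_max_c - t_steady_c) * math.exp(latency_s / tau_s)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    common = dict(t_max_c=95.0, t_amb_c=35.0, r_th_c_per_w=10.0,
                  tau_s=2.0, latency_s=0.1)
    worst_case = throttle_threshold_c(power_w=12.0, **common)
    predicted = throttle_threshold_c(power_w=8.0, **common)
    print(f"worst-case threshold: {worst_case:.2f} C")
    print(f"predicted threshold:  {predicted:.2f} C")
```

Under these assumptions, the threshold derived from the lower predicted power sits closer to Tmax than the worst-case threshold, so throttling engages later and the part spends more time at the faster clock.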

Fig. 3

Figure 3 shows a simple block diagram of the monitor and sensor that feed the Power Predictor, which sets the threshold temperature. All of this is also tied into the operating system, which uses its knowledge of the current workload to help make better predictions. This is all part of an Energy/Temperature-Aware Scheduling (E/TAS) scheme to improve the performance of the system while still running below the signoff Tmax.

Fig. 4

Figure 4 compares the improved thermal management system described in the paper with the original global throttling scheme. With the improved scheme, the temperature shows less variation while running the indicated testbench workload.

Table 1

Table 1, above, shows the results for the authors’ Smart Frame-Per-Second (FPS) control, a closed-loop controller that uses workload prediction and also takes the PCB temperature into consideration. The results show some improvement in average FPS and a more marked improvement in minimum FPS, leading to a smoother video gaming experience.
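As a rough illustration of what a closed-loop FPS controller with workload prediction and a PCB temperature input could look like, here is a minimal sketch. It is not the authors' controller: the class name, the exponentially weighted workload predictor, the linear thermal back-off, and every parameter value are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FpsController:
    """Hypothetical frame-rate controller; all fields are illustrative."""
    target_fps: float
    f_min_mhz: float
    f_max_mhz: float
    pcb_limit_c: float
    guard_band_c: float = 5.0   # back-off starts this far below the limit
    alpha: float = 0.5          # weight of the newest workload sample
    predicted_cycles: Optional[float] = None

    def thermal_cap_mhz(self, pcb_temp_c: float) -> float:
        """Frequency ceiling that falls linearly inside the guard band."""
        headroom_c = self.pcb_limit_c - pcb_temp_c
        frac = min(1.0, max(0.0, headroom_c / self.guard_band_c))
        return self.f_min_mhz + frac * (self.f_max_mhz - self.f_min_mhz)

    def update(self, measured_cycles: float, pcb_temp_c: float) -> float:
        """Return the next clock frequency in MHz.

        measured_cycles is the cycle count of the last frame (the feedback).
        """
        if self.predicted_cycles is None:
            self.predicted_cycles = measured_cycles
        else:
            # Exponentially weighted prediction of the next frame's workload.
            self.predicted_cycles = (self.alpha * measured_cycles
                                     + (1.0 - self.alpha) * self.predicted_cycles)
        # cycles/frame * frames/s = cycles/s, converted to MHz.
        needed_mhz = self.predicted_cycles * self.target_fps / 1e6
        cap_mhz = self.thermal_cap_mhz(pcb_temp_c)
        return min(cap_mhz, max(self.f_min_mhz, needed_mhz))


if __name__ == "__main__":
    ctrl = FpsController(target_fps=60.0, f_min_mhz=500.0,
                         f_max_mhz=3000.0, pcb_limit_c=50.0)
    for cycles, temp_c in [(20e6, 40.0), (40e6, 40.0), (40e6, 48.0)]:
        print(f"{ctrl.update(cycles, temp_c):.0f} MHz")
```

The design choice here is to request only as much frequency as the predicted frame needs to hit the target rate, which keeps heat generation down during light frames and leaves thermal headroom for heavy ones. That is one plausible route to a higher minimum FPS.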

As engineers work to squeeze more efficiency out of their hardware, we will continue to see more sophisticated techniques for monitoring systems and for quickly adjusting clock frequencies. Building in these hooks allows systems to be better tuned to their applications and operating environments. This also enables their use in a broader set of applications with better energy efficiency.

[1] Bo-Jr Huang et al., “A 5G Mobile Gaming-Centric SoC with High-Performance Thermal Management in 4nm FinFET”, ISSCC, pp. 40-42, 2023.
