Power Or Performance?

Most of today’s processors aren’t run according to the standard performance specs, which has an effect on power consumption.

June 10th, 2010 - By: Ed Sperling

By Pallab Chatterjee
Most microprocessors have shifted to new small-geometry processes to get the best combination of power efficiency and performance. However, there is always a trade-off among power, performance and area (PPA) for semiconductors, and this is especially relevant for processors. In the current design space, processors are created as general-purpose products, but they are generally put into user applications that need to be optimized for either power or performance.

The main processors, such as Intel’s iX series, AMD’s Phenom II series, and Nvidia’s GPU products, are routinely not operated at their standard performance specifications. They are either over-clocked or operated at alternate core voltages in end-user applications to increase performance and data throughput. Because the processors are operated in non-standard conditions, the design requirements have to include acceptable limits for these additional modes of operation. The chip cores can be run at a higher voltage, up to 50% more than the standard voltage, and the main clock rate for the chips, the core master clock, may be as much as 50% faster than the nominal clock frequency. To support all of the other functions, such as thermal management, I/O and memory interfaces, and the standard bus handshake, the chips need additional control logic that allows operation at different performance specifications.
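To get a feel for what those limits mean for the power budget, the sketch below uses the common first-order CMOS approximation that dynamic power scales with capacitance times voltage squared times frequency. That model is an assumption brought in for illustration, not something taken from the vendors, and it ignores leakage, which also rises with voltage and temperature. The function name is hypothetical.

```python
def relative_dynamic_power(voltage_scale: float, clock_scale: float) -> float:
    """Dynamic power relative to nominal, assuming P is proportional to C * V^2 * f.

    Capacitance is treated as fixed, so only the voltage and clock
    scale factors (1.0 = nominal) matter. Leakage is not modeled.
    """
    return voltage_scale ** 2 * clock_scale


# Upper limits quoted in the text: core voltage +50%, master clock +50%.
both_at_limit = relative_dynamic_power(1.5, 1.5)

# Either knob on its own, for comparison.
voltage_only = relative_dynamic_power(1.5, 1.0)
clock_only = relative_dynamic_power(1.0, 1.5)

print(f"voltage only: {voltage_only:.3f}x nominal")
print(f"clock only:   {clock_only:.3f}x nominal")
print(f"both:         {both_at_limit:.3f}x nominal")
```

Under this approximation, running both voltage and clock at the 50% limits puts dynamic power at more than three times nominal, which is why the acceptable limits for these modes have to be designed in rather than assumed.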

Nvidia’s GPUs support additional power supply modes and connections as standard. The nominal core voltage is 1.2V and can be increased up to 1.4V. This configuration alone does not maximize performance. Over-clocking of key portions of the chip also needs to be adjusted, both with and without the voltage adjustment. This over-clocking needs to be balanced across the portions of the chip that get the performance increase, so the design does not overrun the local memory or the bus interface and introduce wait states. When these performance changes are made, they are a static change that affects the overall configuration of the graphics board and fan.
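One way to picture the balancing problem is to ask how fast the core can run before the memory can no longer feed it. The sketch below is a toy model under loud assumptions: the bytes-per-cycle demand figure, the bandwidth number and the function names are all hypothetical and do not describe any specific Nvidia part.

```python
def memory_bound_core_mhz(mem_bandwidth_gb_s: float, bytes_per_core_cycle: float) -> float:
    """Highest core clock (MHz) the memory can feed without stalling the core.

    Assumes the core needs a fixed number of bytes every cycle, which is a
    simplification; real demand varies with the workload.
    """
    bytes_per_second = mem_bandwidth_gb_s * 1e9
    return bytes_per_second / bytes_per_core_cycle / 1e6


def useful_core_mhz(requested_mhz: float, mem_bandwidth_gb_s: float,
                    bytes_per_core_cycle: float) -> float:
    """Clamp a requested over-clock to what the memory side can sustain."""
    limit = memory_bound_core_mhz(mem_bandwidth_gb_s, bytes_per_core_cycle)
    return min(requested_mhz, limit)


# Hypothetical board: 64 GB/s of memory bandwidth, core consuming 80 bytes per cycle.
limit = memory_bound_core_mhz(64.0, 80.0)
clamped = useful_core_mhz(900.0, 64.0, 80.0)
print(f"memory-bound core clock: {limit:.0f} MHz, useful clock for a 900 MHz request: {clamped:.0f} MHz")
```

In this toy model, pushing the core past the memory-bound clock adds power without adding throughput unless the memory clock is raised as well, which is the kind of imbalance that shows up as wait states.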

Parameters that can be adjusted include the FSB, memory bus, AGP bus, PCI-E bus, GPU core clock, GPU memory bus, memory timing registers, and hardware-specific performance tuning registers. Because these changes affect the dynamic power of the board, fan and cooling controls are included to help keep the design at the nominal die operating temperature. The higher-performance operation can increase the die temperature by as much as 20°C if upgraded cooling is not applied. Due to the complexity of the performance enhancement, voltage scaling and clock scaling are no longer done by just putting in a different regulator and a different crystal.
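The link between extra power and die temperature can be roughed out with the standard steady-state thermal resistance model, in which die temperature equals ambient temperature plus power times the junction-to-ambient thermal resistance. The sketch below applies that model. The wattages, ambient temperature and thermal resistance are hypothetical values chosen so that the over-clocked case lands on a 20°C rise; they are not measurements of any real board.

```python
def die_temperature_c(ambient_c: float, power_w: float, theta_ja_c_per_w: float) -> float:
    """Steady-state die temperature: T_die = T_ambient + P * theta_JA."""
    return ambient_c + power_w * theta_ja_c_per_w


def required_theta_ja(ambient_c: float, power_w: float, target_die_c: float) -> float:
    """Thermal resistance the cooling solution must reach to hold target_die_c."""
    return (target_die_c - ambient_c) / power_w


# Hypothetical numbers for illustration only.
AMBIENT_C = 40.0
THETA_STOCK = 0.4          # degrees C per watt with the stock cooler
NOMINAL_W = 100.0
OVERCLOCKED_W = 150.0

nominal_temp = die_temperature_c(AMBIENT_C, NOMINAL_W, THETA_STOCK)
hot_temp = die_temperature_c(AMBIENT_C, OVERCLOCKED_W, THETA_STOCK)
theta_needed = required_theta_ja(AMBIENT_C, OVERCLOCKED_W, nominal_temp)

print(f"nominal die temperature:      {nominal_temp:.1f} C")
print(f"over-clocked, stock cooling:  {hot_temp:.1f} C")
print(f"theta_JA needed to hold nominal temperature: {theta_needed:.3f} C/W")
```

With these assumed numbers the cooler would have to shed heat about a third more effectively to hold the nominal die temperature, which is the job the fan and cooling controls take on.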

To control these changes and make sure the chip still operates in a safe design area, Nvidia has produced a control software program, nTune, that lets end users adjust these parameters.

General-purpose CPUs have a few more data dependencies than GPUs, but they have the same customer performance issues. Since CPUs were introduced, people have been pushing the performance aspect of the PPA trade-off. Just as with GPUs, you can adjust the core voltage and over-clock portions of the chip. Unlike with the GPU, the settings are not static and do not produce the same results under all data conditions. For this reason, the higher-performance processors now have automated algorithms for performance improvement based on the data set.

For Intel processors this is part of “Turbo Mode,” which automatically over-clocks the processor for the duration of the operations that need the higher performance. The power envelope for the processor design, including the thermal management, has to take into account these dynamic over-clock modes in addition to traditional systematic over-clocking. Unlike most SoCs, processor designs and most multi-core embedded designs have data-dependent timing and performance characteristics as well as user-adjustable application ranges.
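The general idea of dynamic over-clocking inside a fixed power and thermal envelope can be sketched as a simple control step. This is not Intel's algorithm, whose details are not given here; the class, function, thresholds and clock values are all hypothetical and only illustrate the kind of decision such logic has to make.

```python
from dataclasses import dataclass


@dataclass
class Envelope:
    """Hypothetical operating limits for the illustration."""
    base_mhz: int = 2800
    max_turbo_mhz: int = 3400
    step_mhz: int = 100
    power_limit_w: float = 95.0
    temp_limit_c: float = 90.0


def next_clock_mhz(current_mhz: int, busy: bool, power_w: float,
                   temp_c: float, env: Envelope) -> int:
    """Pick the clock for the next control interval.

    Step up while the workload is busy and there is power and thermal
    headroom; otherwise step back toward the base clock.
    """
    over_budget = power_w >= env.power_limit_w or temp_c >= env.temp_limit_c
    if over_budget or not busy:
        return max(env.base_mhz, current_mhz - env.step_mhz)
    return min(env.max_turbo_mhz, current_mhz + env.step_mhz)


env = Envelope()
clock = env.base_mhz
# Hypothetical sensor trace: (busy, power in watts, die temperature in C).
trace = [(True, 60.0, 70.0), (True, 75.0, 78.0), (True, 96.0, 85.0), (False, 30.0, 60.0)]
history = []
for busy, power_w, temp_c in trace:
    clock = next_clock_mhz(clock, busy, power_w, temp_c, env)
    history.append(clock)
print(history)
```

Because the result depends on the measured power, temperature and load at each step, the same processor can end up at different clocks for different data sets, which is the data-dependent behavior the power envelope has to allow for.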

Ed Sperling

Ed Sperling is the editor in chief of Semiconductor Engineering.
