Big Iron Conundrums

Tradeoffs between power and performance are far less obvious inside data centers than in the mobile market.


Enormous attention is being focused on energy efficiency in mobile devices because time between charges trumps a slight boost in performance. Inside data centers, those benefits are far less clear.

While energy costs remain a huge factor—they are a visible part of the bottom-line costs for a CIO—how to reduce those costs is anything but a simple equation. Merely adding energy-saving devices and powering down servers doesn't necessarily save as much power as running servers more efficiently. There are a number of tradeoffs that all data centers scrutinize to lower those costs.

By far the biggest variable is server utilization. Regardless of whether a data center is used for shared services, such as those offered by Rackspace and Amazon, or is a private cloud—the option chosen by most large corporations for security reasons—the goal is to limit computing to as few machines as possible and to power down the rest. A server running at 60% to 85% of capacity is considered optimally utilized because it still provides a buffer to deal with workload spikes. And because most applications can be virtualized, they also can overflow onto other servers if the utilization rate spikes or if a server fails.
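The consolidation logic described above can be sketched as a simple packing problem: assign workloads to as few servers as possible while keeping each machine under the top of the 60%–85% band so there is headroom for spikes. The first-fit heuristic, capacity units, and workload numbers below are all illustrative assumptions, not taken from any real data center.

```python
# Hypothetical sketch: pack workloads onto as few servers as possible,
# capping each server at 85% utilization to leave headroom for spikes.
# Capacities, workloads, and the first-fit heuristic are illustrative.

def consolidate(workloads, capacity=100.0, high=0.85):
    """First-fit decreasing: place each workload demand (in capacity
    units) on the first server that stays at or under `high` utilization;
    otherwise power up another server."""
    servers = []  # load currently assigned to each active server
    for demand in sorted(workloads, reverse=True):
        for i, load in enumerate(servers):
            if load + demand <= high * capacity:
                servers[i] += demand
                break
        else:
            servers.append(demand)  # no room anywhere: add a server
    return servers

loads = [30, 25, 20, 45, 10, 15, 40]   # demands from 7 virtualized apps
active = consolidate(loads)
print(len(active), "servers at utilizations",
      [round(l / 100.0, 2) for l in active])
# 7 workloads fit on 3 servers; the rest can be powered down.
```

Real schedulers weigh far more than raw load—affinity, redundancy, and failure domains among them—but the goal of concentrating work and idling the remainder is the same.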

A second factor, and one that is more difficult to measure, is computations per clock cycle. Unlike most mobile applications, the majority of large enterprise-level applications can be partitioned across multiple processors because they are database-centric. One clock cycle can involve multiple threads on multiple processors, meaning more compute power applied to each clock cycle can be more efficient than a lower-power, lower-performing processor. There are still savings to be gained by improving a server's overall efficiency with new architectures, but they aren't as significant as utilizing what's already available.
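The underlying arithmetic is that energy equals power multiplied by time: a higher-power server that finishes a parallelizable job much faster can consume less total energy than a low-power one that grinds away longer. The wattage and throughput figures below are purely hypothetical, chosen only to show the shape of the comparison.

```python
# Back-of-envelope sketch: energy = power x time, so the faster,
# higher-power machine can win on total joules for a parallel job.
# All power and throughput figures are hypothetical.

def job_energy(power_watts, throughput_units_per_s, work_units):
    """Total energy (joules) to finish a job at a steady throughput."""
    seconds = work_units / throughput_units_per_s
    return power_watts * seconds

work = 1_000_000  # arbitrary units of work in one job

big_iron = job_energy(power_watts=400, throughput_units_per_s=8000,
                      work_units=work)   # 125 s  -> 50,000 J
low_power = job_energy(power_watts=120, throughput_units_per_s=1500,
                       work_units=work)  # ~667 s -> 80,000 J

print(f"big iron:  {big_iron/1000:.0f} kJ")
print(f"low power: {low_power/1000:.0f} kJ")
```

The comparison only holds when the work actually parallelizes and the fast machine can be loaded up or powered down afterward—which is why utilization remains the dominant variable.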

That leads to the third factor, which is the software. While energy consumed by software is a relatively new and increasingly important focus area in mobile devices, it is barely a consideration inside data centers. They don't swap out giant enterprise resource planning applications or operating systems because one can save more power than another. The far bigger costs are training, consistency of service and downtime. Knowing that software works and that many of the bugs have been worked out is critical in the server world, where downtime costs are enormous—sometimes millions of dollars a day. This helps explain why corporations are typically a release or two behind the consumer market even for PCs, and why the controversy about the future of Windows 8 hasn't affected most of them. They're still on Windows 7.

The result is that software generally is retrofitted for efficiency rather than designed for it. The applications themselves are not new, but reworking them to fit more into a single clock cycle and to spread more work across more processors in less time is often more efficient than starting from scratch. And because most of the private clouds are owned by established corporations with sufficient resources to justify these costs, this is the typical route taken.

A fourth factor is the server architecture, and this frequently depends on the particular market sector. A Google or Bloomberg search, for example, represents a consistent, predictable use of a processor. A corporation that does financial planning, runs an assembly line and develops marketing material with graphics will use processors very differently, and if it relies on general-purpose processors its computing will never be as efficient as a search function. Efficiency in architectures is relative to individual companies and what they do with those architectures. Improving throughput may be the best that some companies can achieve, while others may be able to benefit from improvements in power-saving features and reduced overhead for redundancy in case something goes wrong. But in search, losing a query is far less important than in banking, where it can mean the loss of critical financial data.

Put all these factors together and several conclusions can be drawn. First, one size never fits all, and specific ways to improve power/performance in one company may be irrelevant in another. Second, nothing moves quickly in the IT world, which typically is the most conservative adopter of technology on the planet. And third, while energy remains an important factor inside data centers—the cost of powering and cooling machines can run millions of dollars per year—the formula for calculating those costs is anything but simple. Energy efficiency remains an important consideration, but solutions by necessity will vary greatly.
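The "millions of dollars per year" figure is easy to reconstruct from the standard facility-cost arithmetic: IT power draw, scaled by a power usage effectiveness (PUE) factor to account for cooling and distribution overhead, times hours of operation, times the utility rate. The input values below are hypothetical but in a realistic range for a mid-sized facility.

```python
# Rough sketch of the annual power-and-cooling arithmetic. PUE scales
# IT load up to total facility draw. All input values are hypothetical.

def annual_energy_cost(it_load_kw, pue, dollars_per_kwh, hours=8760):
    """Yearly cost of powering and cooling a data center.
    8760 = hours in a (non-leap) year of 24/7 operation."""
    facility_kw = it_load_kw * pue  # IT load plus cooling/distribution
    return facility_kw * hours * dollars_per_kwh

cost = annual_energy_cost(it_load_kw=2000, pue=1.6, dollars_per_kwh=0.10)
print(f"${cost:,.0f} per year")  # a 2 MW IT load lands in the millions
```

Even this simple model shows why the equation isn't simple in practice: every term—load, PUE, and rate—varies with utilization, climate, and contract, and they interact.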

—Ed Sperling