Chip Advances Play Big Role In Cloud

Semiconductor improvements add up to big savings in power and performance.


Semiconductor engineering teams have been collaborating with key players in the data center ecosystem in recent years, resulting in unforeseen and substantial changes in how data centers are architected and built. That includes everything from which boxes, boards, cards and cables go where, to how much it costs to run them.

The result is that bedrock communication technology and standards like serializer/deserializer (SerDes) and Ethernet are getting renewed attention. Technology that has been taken for granted is being improved, refined, and updated on a grand scale.

Some of this is being spurred by the demands and deep pockets of Facebook, Google and their peers, with their billions of server hits per hour. Some of it also is being driven by the deceptively complex world of cloud computing and the shared services enterprise model for billions of workaday accountants, salespeople, retail points of sale, and logistics professionals, among others. But nearly all of it can be traced back to advances in semiconductor design, which have provided both the blueprint for what happens at a system level and the capabilities to improve the performance and power of complex systems at a lower cost.

Fig. 1: The price of power—data center cooling towers. Source: IBM

“There has been a relentless progression with performance and power scaling to the point where computation almost looks like an infinite resource these days,” said Steven Woo, distinguished inventor and vice president of enterprise solutions technology at Rambus. “And there is a lot more data. You need to drive decisions on what you put, where, based on that data.”

This requires hyperscale data centers: football field-sized expanses of servers, switches and storage drives, physically connected by cables and components for power, as well as cables to carry the data signals.

Fig. 2: Cloud-scale switches. Source: Cisco

Cisco Systems has ushered in a new era of data center topologies that account for this new reality. First, data center fabrics have to understand exactly what the server tenants (the software) need, along with their usage models. That entails intricate new “spine and leaf” architectures, which allow for virtual network overlays. In the context of today’s cutting-edge IEEE 802.3by standard, which uses 25-Gbps lanes to achieve 100 Gigabit throughput speeds, this is one place where chipmakers get involved.
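The relationship between per-lane rate and aggregate link throughput is simple arithmetic, and a minimal sketch makes it concrete. The function name `lanes_needed` is hypothetical, and the calculation assumes raw lane rates with no allowance for encoding or protocol overhead.

```python
import math


def lanes_needed(target_gbps: float, lane_gbps: float) -> int:
    """Return the smallest number of lanes whose combined raw rate meets the target.

    Assumption: raw signaling rates only; encoding overhead is ignored.
    """
    if target_gbps <= 0 or lane_gbps <= 0:
        raise ValueError("rates must be positive")
    return math.ceil(target_gbps / lane_gbps)


# Four 25-Gbps lanes reach 100 Gigabit; ten 10-Gbps lanes would be needed otherwise.
print(lanes_needed(100, 25))  # 4
print(lanes_needed(100, 10))  # 10
```

Fewer, faster lanes mean fewer SerDes and fewer conductors per link, which is why the per-lane rate matters so much to chipmakers.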

“A lot of these are concepts and waves of thinking in data flow architectures of the 1980s, and they’re making their way back,” said Woo. “But they’re very different now. Technologies have improved relative to each other and the ratios against each other are all different. Basically, what you’re doing is taking the data flow perspective and optimizing everything.”

Minor considerations, big impact
Optimizing everything is how Marvell Semiconductor sees it, as well. Marvell continues to churn out Ethernet switch and PHY silicon, but performance demands are rising, and the payoff for meeting those demands is greater. The cabling between the top-of-rack Ethernet switches and the array of servers beneath them may seem like a minor consideration, but it has a big impact on data center design, cost and operation. The best SerDes enable 25Gbps throughput, and they also have long-reach capability that allows for ‘direct attach’ without supplemental power.

“Passive cables have really revolutionized [data center engineering],” said Venu Balasubramonian, marketing director for the Connectivity, Storage and Infrastructure business unit at Marvell. “There is no power associated with the cable itself.”

This potential brought together a worldwide “meeting of the minds” among power users like Google, the rest of the industry, and IEEE to create a 25Gbps standard rather than go directly from 10Gbps to 40Gbps. Not only is the power supply removed within the rack, but, equally important, the backplane can be copper, not fiber.

There are a couple of key pieces of infrastructure that can be removed to lower power and improve cost efficiency in the data center. Top of the list is the power infrastructure between all the servers and switches within it that are 5 meters or less apart. Second are the expensive optical transceivers, and all that comes with them, again within the data center and the racks themselves.
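As a rough illustration of that 5-meter threshold, the sketch below sorts a set of link lengths into those that could use passive copper and those that would still need powered or optical hardware. The names `PASSIVE_REACH_M`, `link_medium` and `tally`, and the sample distances, are hypothetical; real reach depends on the cable, the SerDes and the data rate.

```python
# Assumption: 5 m is the reach the article cites for unpowered links.
PASSIVE_REACH_M = 5.0


def link_medium(distance_m: float) -> str:
    """Classify a link by whether it fits within the passive copper reach."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    if distance_m <= PASSIVE_REACH_M:
        return "passive copper"
    return "powered or optical"


def tally(distances_m):
    """Count how many links fall into each category."""
    counts = {"passive copper": 0, "powered or optical": 0}
    for distance in distances_m:
        counts[link_medium(distance)] += 1
    return counts


# Hypothetical link lengths in meters: in-rack runs plus two longer hauls.
print(tally([0.5, 2.0, 3.0, 5.0, 12.0, 80.0]))
# {'passive copper': 4, 'powered or optical': 2}
```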

Engineering teams are working overtime to develop 802.3by-capable silicon and systems in light of all of this. At the upcoming DesignCon show in February in Santa Clara, Tektronix plans to focus heavily on its set of solutions for design and debug of this key data center technology, such as pulse amplitude modulation 4 (PAM4) electrical PHYs for the standard.

“Linking protocol analysis with the scope is so powerful,” said Chris Loberg, senior marketing manager of performance oscilloscopes at Tektronix. “You can drill back deeper into the physical layer and find out where it fails electrically. Learn what really caused that behavior and fix what caused it. That’s a powerful example of how the scope as a tool has changed.

“We also just introduced something called ‘link training’ where you are decoding a communications link between Ethernet transceivers and replicating that link between 10Gbps and 25Gbps.”

Consolidation effects
Broadcom has significantly consolidated the field of players that can engineer and build this top-of-the-line SerDes. Ten years ago, there was Broadcom, LSI Logic, Agere Systems and Agilent. Today the four are one. And the competition with Marvell remains fierce.

Marvell’s Balasubramonian, like many other semiconductor executives, takes the tough demands of the data center in stride.

“Most of the cost of running a data center, the ongoing OpEx, is in power and cooling,” he said. “But at the same time systems makers don’t want to pay more for their silicon. There is considerable pressure to drive down or keep costs flat. We have to add bandwidth without adding costs. We have to deliver the components that drive down their power and cooling costs, without raising our own.”

Marvell uses ARM cores in many of its switch families, which helps keep the silicon power consumption low. ARM has spent decades perfecting that.

Rambus has been working on a project it calls smart data acceleration, an approach it believes is essential for everything from data centers to supercomputers. The idea is that because the amount of data is exploding, it makes more sense to bring the processing to the data rather than the other way around. The server CPUs within the rack attach to DDR memory, but in the switch fabric, this memory isn’t useful.

“The CPU must use DDR,” said Amit Avivi, senior product line manager at Marvell. “But the switch level bandwidth is way too high to use DDR. Advanced switch (silicon) within the switch (device), optimize the traffic to minimize the memory needs. There is lots of prioritization, and there are lots of handshakes to optimize that traffic.”

But memory makers are still key, as always. Micron is building out its business in solid state drives (SSDs) for the data center, which are flash memory-based. This fall Micron has asserted strongly that its new S600DC series SAS SSDs have reached price parity with SAS hard disk drives (HDDs). The company touts how much more power-efficient these drives are, along with the reduced cost of the software licenses needed to manage and virtualize HDD-based storage arrays.

“It may sound like a small change, a 2-watt savings,” said Greg Schulz, founder of storage IO of Minneapolis, in a recent Micron webinar. “Over 100 drives, that’s 200 watts. Small changes at scale add up very, very quickly.”
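Schulz's arithmetic can be extended to a yearly figure with a short sketch. The function names are hypothetical, and the annual number assumes the drives run continuously all year, which the article does not state.

```python
HOURS_PER_YEAR = 24 * 365  # assumption: drives are powered around the clock


def fleet_savings_watts(watts_per_drive: float, drive_count: int) -> float:
    """Total power saved across a fleet of drives, in watts."""
    return watts_per_drive * drive_count


def annual_kwh(watts: float) -> float:
    """Convert a constant power draw in watts to kilowatt-hours per year."""
    return watts * HOURS_PER_YEAR / 1000


saved_w = fleet_savings_watts(2, 100)
print(saved_w)              # 200
print(annual_kwh(saved_w))  # 1752.0
```

The same functions scale directly to larger fleets, and the figure excludes any knock-on reduction in cooling load.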

Related Stories
Rethinking The Cloud
As the volume of data increases exponentially, companies are changing their approach to where data gets processed, what gets moved, and focusing on the total price for moving and storing that data.
New Metrics For The Cloud
Better performance is a relative term, depending upon the type of chips, what algorithms they’re using, and how much it costs to run them.
Executive Insight: Simon Segars
ARM’s CEO digs into the future of mobile and the IoT, why ecosystems are so important, and what’s changing in the data center.
Executive Insight: Charlie Cheng
Kilopass’ CEO talks about how to cut the capacitor in DRAM and why that’s important in the data center.
Silicon Photonics Comes Into Focus
Using light to move large quantities of data looks promising, but gaps remain and the adoption timeline will vary by application.
