Rethinking SSDs In Data Centers

A frenzy of activity aims to make solid-state drives faster and more efficient.


Semiconductors that control how data gets on and off solid-state drives (SSDs) inside of data centers are having a moment in the sun.

This surge in interest involves much more than just the SSD device. It leverages an entire ecosystem, starting with system architects and design engineers, who must figure out the best paths for data flow on- and off-chip and through a system. It also includes the I/O IP and the physical distance of the storage device from the server racks. And it includes the interconnect fabric, which in the case of the cloud and large enterprises increasingly relies on silicon photonics.

Four main issues come into play around SSDs: latency, power, cost, and bandwidth to, from, and between different SSDs. These elements overlap, because power and performance may be traded off, depending upon the application, or simply accepted as the cost of doing business for some jobs. In every case, though, this is a balancing act among the factors that matter most to the users of that data. So a cloud communicating safety-critical information will have much higher performance requirements than a static records search or an online shopping transaction.

The SSD itself is one component in that decision. And while there has been a flurry of activity around improving the performance of SSDs, including how many layers of NAND can be piled on top of each other to increase the density and how data can be retrieved and stored on various layers, it’s only one piece of the performance puzzle.

For example, one of the key discussion points at next week’s Flash Memory Summit in Santa Clara, Calif., will center around what NVMe brings to the table over older-generation storage interfaces SATA (serial advanced technology attachment) and SAS (serial-attached small computer system interface), as well as what NVMe over fabrics can do going forward from here.

SATA is the dominant interface today in terms of units shipped for data center and enterprise SSDs, according to Cameron Crandall, senior technology manager in memory giant Kingston Technology’s SSD product engineering department. But that is changing.

Fig. 1: NVMe growth drivers. Source: eInfochips

“SATA and SAS have always been limiting factors for (SSD) performance and the industry has known this for 10 years,” said Crandall, who has been with Kingston for 20 years and brings expertise from years spent engineering SCSI and Fibre Channel storage arrays. “In the beginning, SSDs basically mimicked a hard drive. The standards bodies have been trying to figure out how to get away from SATA and SAS ever since.”

Then, SandForce (which LSI Logic bought in 2012 and sold to Seagate in 2014) developed and brought to market the NVMe driver so that NAND flash SSDs could attach via the ubiquitous PCI Express interface.

“PCI Express prevailed with the NVMe driver,” said Crandall. “That gave us the right physical connection on the system board. It gave a big performance jump over serial ATA and SAS. PCI Express was the right interface for SSDs to get that next big performance jump.”

Seagate today has internal standard PCI Express NVMe controller and custom ASIC controller teams, as well as a dedicated sales force and engineering resource team for the large public cloud companies. The cloud companies typically specify and integrate their own data center hardware including storage.

“We are big on NVMe,” said Tony Afshary, director of product marketing and ecosystem solutions for flash products at Seagate. “We are on our fifth generation of PCIe SSD storage to date. Storage over PCIe interface is extremely important to us. For any hyperscale or cloud-like, Web-like infrastructure build out, those are all NVMe. It has absolutely taken over SATA as the interface in this market. There are innovative ways that you can create storage in the data center that NVMe has really opened up.”

What’s new here is more predictable performance, along with the ability to create fabrics and pool groups of NVMe drives to share across servers.

“That is pretty exciting in the NVMe world, both in the hyperscale and traditional array enterprise market,” said Afshary. Seagate will show new innovations around the smaller ‘gum stick’ M.2 form factor at the August Flash Memory Summit. That could be significant, since the U.2 form factor has stuck around to date because it is easier to service.

One of Kingston’s core partners is Marvell Semiconductor, the merchant market leader in storage controllers for SSDs and HDDs. As Nick Ilyadis, Marvell’s vice president of portfolio and technology strategy, put it at the recent GSA Silicon Summit, “The era of SSDs in data centers is now.”

Marvell is supporting EMC’s XtremIO data center SSD arrays, and Facebook’s “Lightning” custom storage array, which was shown at this spring’s Open Compute Project conference. Marvell has a multi-year strategy investing in new chipset architectures that add RAID and virtualization to SSDs in the data center, which will complement its core engineering efforts to get to 100Gbps SerDes.

“We have multiple developments going on in that area. That is a real area of focus for us,” said Nigel Alvares, vice president of marketing for SSD and enterprise storage.

Right now, one data center industry camp of integrators and engineers is excited about the prospect of bringing the critical flash translation layer (FTL) off of the semiconductor controller and on to the server side in data center storage arrays. This change in controller/SSD technology approach is dubbed “Light NVM,” or “open-channel SSDs.”

Fig. 2: EMC’s X-Brick, a building block of its XtremIO array. Source: EMC

Performance limits
But others argue this only brings advantages with known workloads and control of the entire software stack. As a result, this is only relevant for a particular slice of the data center SSD market.

Jeroen Dorgelo, director of SSD marketing at Marvell’s Dutch engineering center, explained the engineering mindset behind this. “For enterprise SSDs, because it’s flash memory, you write your data on multiple devices at the same time. It’s all about redundancy. But some people in the industry said, ‘Hey, this is a lot of duplication. What if I take the overhead inside the SSD, what that FTL does, and put that on the server side?’ This is because of the basic difference between SSDs and HDDs. In HDDs, you overwrite the data. In SSDs, this is impossible. You have to write the complete file new. That means you have to translate your logical address to a physical address.”
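The address translation Dorgelo describes can be illustrated with a toy model. The following is a minimal sketch, not a description of any vendor's FTL: the class name `TinyFTL`, the page-level granularity, and the simple free list are all assumptions made for illustration. It shows only the core idea that an update goes to a fresh physical location while a table redirects the logical address.

```python
class TinyFTL:
    """Toy flash translation layer (hypothetical, for illustration only)."""

    def __init__(self, num_physical_pages):
        self.free_pages = list(range(num_physical_pages))  # pages never written
        self.l2p = {}        # logical page number -> physical page number
        self.flash = {}      # physical page number -> stored data
        self.stale = set()   # physical pages that hold obsolete data

    def write(self, logical_page, data):
        # Flash pages cannot be overwritten in place, so every write
        # lands on a fresh physical page.
        if not self.free_pages:
            raise RuntimeError("no free pages; a real FTL would garbage-collect")
        new_phys = self.free_pages.pop(0)
        self.flash[new_phys] = data
        old_phys = self.l2p.get(logical_page)
        if old_phys is not None:
            self.stale.add(old_phys)  # old copy must be erased later
        self.l2p[logical_page] = new_phys
        return new_phys

    def read(self, logical_page):
        # Translate the logical address to wherever the latest copy lives.
        return self.flash[self.l2p[logical_page]]


ftl = TinyFTL(num_physical_pages=4)
first = ftl.write(7, b"v1")
second = ftl.write(7, b"v2")   # same logical page, different physical page
latest = ftl.read(7)
```

In an open-channel arrangement, bookkeeping of this kind would run on the host rather than in the drive's controller, which is why it suits operators who know their workloads.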

The controllers with an off-loaded FTL are almost as complicated. This is really about the software being simplified, he said. The key is what the data center integrator intends to do, compute-wise.

“In a big, big data center, where a certain application is going to be performed and needs to be optimized, you can do (the translation) at the host level,” said Dorgelo. “But if you don’t know how the work will look, it’s better to do this at the controller. There are a lot of dynamics here with this ‘Light NVM’ for data centers that know their workloads. They can really optimize their total cost of ownership. But in the big mainstream data center, you are going to see them do the work close to the NAND. For certain workloads and big data center guys who really know what their workload is going to be, they will do this for a percentage of their SSDs, but not all of their SSDs.”

There are other efforts afoot to improve the performance of the SSDs, as well. “On one level this is an operating system problem,” said Charlie Cheng, CEO of Kilopass. “There is work underway on that, but so far it hasn’t been fixed. The problem there is the file system, which accesses the files, not the memory address space.”

In effect, this amounts to extra layers to retrieve data. So SSDs are slower than DRAM in a side-by-side comparison in a lab, but they’re even slower in the real world as more layers are added.

“PCIe has its own level of encapsulation, so you can’t access data directly,” said Cheng. “This is one of the reasons there has been a lot of interest in DRAM SSDs. With a DRAM buffer you can learn the application and prefetch some data, which can be stored in cache.”
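The prefetching idea Cheng mentions can be sketched in a few lines. This is a simplified illustration under stated assumptions, not how any particular DRAM-buffered SSD works: the class `ReadAheadCache`, the `depth` parameter, and the "two consecutive block numbers means sequential" heuristic are all hypothetical.

```python
class ReadAheadCache:
    """Toy read-ahead buffer in front of a slow block store (hypothetical)."""

    def __init__(self, backing_read, depth=4):
        self.backing_read = backing_read  # callable: block number -> data
        self.depth = depth                # how many blocks to prefetch
        self.cache = {}                   # block number -> prefetched data
        self.last_block = None
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.cache:
            self.hits += 1
            data = self.cache.pop(block)
        else:
            self.misses += 1
            data = self.backing_read(block)
        # Crude "learning": if this read follows the previous block,
        # assume a sequential scan and fetch the next few blocks early.
        if self.last_block is not None and block == self.last_block + 1:
            for nxt in range(block + 1, block + 1 + self.depth):
                if nxt not in self.cache:
                    self.cache[nxt] = self.backing_read(nxt)
        self.last_block = block
        return data


cache = ReadAheadCache(backing_read=lambda n: f"block-{n}", depth=4)
results = [cache.read(n) for n in range(8)]  # sequential scan of blocks 0..7
```

In this sequential scan only the first two reads go to the slow store on demand. A random access pattern would see no benefit, which is why the approach depends on knowing, or learning, the application.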

A different kind of light
Even with existing SSDs, figuring out the best architecture and approach for a specific use is a complicated exercise. It involves a detailed cost analysis as well as an understanding of where bottlenecks can arise within a data center. Within a cloud-based data center, those bottlenecks can shift, as well.

In many cases, just moving data back and forth between the servers and storage can take too long, which is why photonics has become so popular. As a point of reference, the optical transceiver market is expected to hit $6.9 billion by 2022, based on 13.5% compound annual growth starting in 2016, according to MarketsandMarkets.
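For a sense of scale, the forecast can be run backward to estimate the starting market size. This is a rough sketch that assumes 2016 is the base year and that growth compounds annually over six periods; the source does not state the 2016 figure, so the result is an inference rather than a reported number.

```python
def implied_base(final_value, cagr, years):
    """Back out the starting value from a final value and a compound annual growth rate."""
    return final_value / (1.0 + cagr) ** years


# $6.9 billion in 2022 at 13.5% CAGR, assuming 2016 as the base year.
base_2016 = implied_base(final_value=6.9, cagr=0.135, years=2022 - 2016)
print(f"Implied 2016 market size: ${base_2016:.2f} billion")
```

Under those assumptions the forecast implies a 2016 market of a little over $3 billion, meaning the optical transceiver market would roughly double over the period.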

Photonics has its own set of challenges, though, which is why most of the photonics chips so far have been multi-chip solutions rather than everything on one chip. Typically, the chips are connected with a silicon interposer, or they can be assembled using a package-on-package configuration.

“There are some unique challenges,” said Michael White, director of product marketing for Calibre Physical Verification products at Mentor, a Siemens business. “Simulation is difficult. There also are parasitics with line-edge roughness on the waveguides. Maintaining uniformity and staying at a certain temperature is a challenge. A photonic IC also uses non-Manhattan (irregular) shapes.”

And that’s just the beginning. “Everything has to be tuned precisely to the wavelength of the signal,” said Juan Rey, senior director of engineering for Calibre at Mentor. “If you shift the temperature, that has an impact on the wavelength. There is a well-defined pitch on how it responds. A few degrees makes a big difference. On top of that, everyone is using III-V materials.”

III-V compounds such as gallium arsenide, indium arsenide, and indium gallium arsenide are used as the light sources for photonics, but they are difficult to work with using conventional silicon processes. As a result, the economics that have made semiconductor device scaling possible don’t exist yet on the photonics side. This has made them prohibitive to use for many applications, but the data center is one of the more forgiving environments when it comes to pricing.

Rey noted that the battles underway for communication in the data center are from 3 centimeters to 1 meter, within a rack, and at 400 meters-plus. Photonics works well in both settings due to low power and high speed.

Fig. 3: Silicon photonics chips. Source: IBM

New options
Beyond the data transport medium, some companies are looking to rearchitect the entire data center.

“People are creating so much data at such a heavy clip, in the near term, which means the SSD market is growing,” said Marvell’s Alvares. “The HDD market is growing significantly, too. Data storage in general is growing gangbusters.”

The problem is that there is so much data being generated, particularly from images, that moving that data to one or more processors isn’t necessarily the best option. This is well understood with DRAM, and utilizing new memory architectures such as HBM2 for 2.5D and fan-out wafer-level packaging greatly speeds up data throughput between the processor elements and off-chip memory. Moving processing closer to storage—which in effect creates yet another level of memory—is still at the PowerPoint stage, but it is beginning to get some notice.

This is particularly key for edge server architectures, which are still being defined, as well as for cloud operations, which are constantly being refined and redefined. One of the more ambitious schemes comes from ConnectX, which has proposed putting entire data centers in outer space. While the company talks about the cost and security advantages, there are other advantages, as well, that can be duplicated closer to home.

“There are two ways to improve performance,” said Craig Hampel, chief scientist at Rambus. “One is power scaling through . What’s clear, though, is that we need an alternative to traditional scaling. The second is quantum machines, and those need to operate at extremely low temperatures to limit thermal noise. But if you lower the temperature enough, you get multiple orders of efficiency improvement in the movement of electrons. You burn 1/10th of the power at 77 degrees Kelvin. You also get a second effect, which is thermal density. You can pack things closer together, which means the wires can be shorter. So if a quantum machine can be built that is commercially acceptable, it will be 9 orders of magnitude more efficient, and memory will be 10 times more efficient.”

This applies to all types of data storage, but it is unlikely to be used outside the enterprise because the economics of cooling a device to those kinds of sub-zero temperatures don’t work except on a mass scale—or in outer space.

Data centers historically have been one of the most conservative markets for technology. They are slow to adopt technology and slow to change it out because the impact of those decisions gone awry can be enormous. But as the cost of cooling higher-density racks of servers increases, and as the amount of data being stored in data centers continues to skyrocket, they are much more open to looking at new approaches than in the past.

This bodes well for solid-state storage, photonics, and any architectural changes that can save significant amounts of money. And it indicates that companies investing in this area are at least on the right track, even if all the pieces aren’t entirely clear.

—Ed Sperling contributed to this report.

