Verifying complex SSD controllers requires more flexibility, earlier.
By Ben Whitehead and Paul Morrison
The storage market demands that huge amounts of data and information be stored securely and be accessible anywhere and anytime, driving the adoption of key technologies and use models. According to GSMAintelligence.com, newly created digital data is doubling every two years. This means increasing amounts of storage must be available at the same pace.
As reported by Statista, hard disk drives (HDDs) continue to be a dominant source of bits shipped, but solid state drives (SSDs) are on the growth curve. The capacity, size, and performance of SSDs make them a very interesting technology for current and future applications. Moreover, cloud computing is making storage more convenient and easier to use. Non-Volatile Memory Express (NVMe) over Ethernet or Fibre Channel is becoming a leading solution for connecting appliances to servers.
However, hardware challenges for SSDs (and HDDs) are substantial — and increasing, especially as more players enter the market with ever faster, higher capacity SSDs.
A primary component of every SSD is a complex controller that must perform a myriad of tasks to receive, monitor, and deliver data accurately and reliably. Each controller requires an algorithm and firmware (FW) to manage the complexities of writing and reading the various types of flash. The media is changing rapidly with the arrival of NAND, 3D NAND, 3D XPoint, and future technologies. Competition is fierce, with the battleground looking much like that of the HDD industry in its early years.
To ensure these controllers are built optimally and delivered to market quickly, an increasing number of controller design teams are turning to emulation-based verification methodologies. Let’s take a look at why this is the case.
SSD controller verification challenges
As the industry moves to SSDs, the controller faces a surprisingly large number of completely different challenges. The back-end NAND channels can now deliver enough bandwidth to saturate a PCIe bus, which was never the case with a spinning disk. Consequently, accurate architectural modeling is required to ensure that power and performance trade-off decisions meet requirements.
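A quick back-of-the-envelope calculation shows how the back end can outrun the host link. The sketch below is only illustrative: the lane count, channel count, and transfer rates are assumed values (a four-lane link at 8 GT/s with 128b/130b encoding, and eight 8-bit NAND channels at 800 MT/s), the function names are hypothetical, and protocol overhead is ignored.

```python
# Back-of-the-envelope check: can the NAND back end outrun the host link?
# All figures are illustrative assumptions, not measurements of any product.

def pcie_gbytes_per_s(lanes: int, gt_per_s: float = 8.0) -> float:
    """Raw PCIe bandwidth in GB/s, assuming 128b/130b line encoding."""
    return lanes * gt_per_s * (128 / 130) / 8  # bits -> bytes

def nand_gbytes_per_s(channels: int, mt_per_s: int, bus_bytes: int = 1) -> float:
    """Peak aggregate NAND interface bandwidth in GB/s."""
    return channels * mt_per_s * bus_bytes / 1000

host = pcie_gbytes_per_s(lanes=4)                      # assumed x4 host link
backend = nand_gbytes_per_s(channels=8, mt_per_s=800)  # assumed 8 channels

print(f"host link: {host:.2f} GB/s")
print(f"NAND back end: {backend:.2f} GB/s")
print("back end can saturate the host link:", backend > host)
```

With these assumed numbers the peak back-end bandwidth exceeds what the host link can carry, which is why the architectural trade-offs have to be modeled rather than guessed.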
Managing NAND, in all its various types, requires complex wear leveling, table management, and garbage collection, in addition to all the interface requirements of a hard drive: security, compression, and error correction code (ECC).
Figure 1. SSD system verification complexity has hit the wall.
Additionally, there is no easy way to measure real performance until the system-on-chip (SoC) is in the complete system with real firmware. Until then, verification engineers can only estimate performance, and the estimates habitually miss something.
To check for optimal characteristics based on the shipping configurations, engineers choose to implement A/B testing or split testing. This means running verification for different NAND, different size drives, different configurations, and different connectivity. This is possible, but challenging with FPGA prototypes, and close to impossible with simulation.
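To get a feel for how quickly this kind of split testing grows, the configurations can be enumerated as a simple matrix. This is a minimal sketch, assuming four made-up axes with placeholder values; none of the names correspond to real parts or to any vendor's configuration format.

```python
from itertools import product

# Hypothetical configuration axes; the values are placeholders only.
NAND_TYPES = ["nand_a", "nand_b"]
CAPACITIES_GB = [256, 512, 1024]
CHANNEL_COUNTS = [4, 8]
HOST_LINKS = ["pcie_x2", "pcie_x4"]

def build_matrix():
    """Return one dict per combination of the axes above."""
    return [
        {"nand": nand, "capacity_gb": cap, "channels": ch, "host": host}
        for nand, cap, ch, host in product(
            NAND_TYPES, CAPACITIES_GB, CHANNEL_COUNTS, HOST_LINKS
        )
    ]

matrix = build_matrix()
print(len(matrix), "configurations to verify")
for config in matrix[:3]:
    print(config)
```

Even this small example yields two dozen variants, each of which would need its own verification run.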
The nature of flash technologies creates some interesting verification challenges that need to be managed hours after the drive is first powered up, that is, after one or more drive-fills. This new reality of drive performance makes simulating a complete system nearly impossible with traditional methods. It is usually done for the first time with actual drive hardware and firmware, or with models that pre-load a possible drive state that creates interesting test cases. This can lead to some unpleasant surprises the first time the drive integration is done.
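A toy model helps show why behavior after a drive-fill differs from behavior on a fresh drive. The sketch below is not how any production controller works: it assumes a tiny page-mapped translation layer with a simple greedy garbage-collection policy, and the block count, page count, and over-provisioning ratio are arbitrary assumptions. It compares write amplification (NAND page writes per host page write) before and after the drive has been filled several times.

```python
import random

class ToyFTL:
    """Toy page-mapped flash translation layer with greedy garbage collection."""

    def __init__(self, blocks=64, pages_per_block=32, op_ratio=0.125, seed=0):
        self.ppb = pages_per_block
        # Logical capacity is smaller than physical capacity (over-provisioning).
        self.logical_pages = int(blocks * pages_per_block * (1 - op_ratio))
        self.valid = [set() for _ in range(blocks)]  # valid logical pages per block
        self.where = {}                              # logical page -> block index
        self.free = list(range(1, blocks))           # erased blocks
        self.active, self.fill = 0, 0                # open block, pages used in it
        self.nand_writes = 0
        self.rng = random.Random(seed)

    def _program(self, lpn):
        """Write one page to the open block, invalidating any older copy."""
        if self.fill == self.ppb:
            self.active, self.fill = self.free.pop(), 0
        old = self.where.get(lpn)
        if old is not None:
            self.valid[old].discard(lpn)
        self.valid[self.active].add(lpn)
        self.where[lpn] = self.active
        self.fill += 1
        self.nand_writes += 1

    def _collect(self):
        """Relocate valid pages out of the block holding the fewest, then erase it."""
        candidates = [b for b in range(len(self.valid))
                      if b != self.active and b not in self.free]
        victim = min(candidates, key=lambda b: len(self.valid[b]))
        for lpn in list(self.valid[victim]):
            self._program(lpn)
        self.valid[victim].clear()
        self.free.append(victim)

    def host_write(self, lpn):
        """Service one host page write, reclaiming space first if needed."""
        while len(self.free) < 2:
            self._collect()
        self._program(lpn)

def write_amplification(ftl, host_pages):
    """Issue random overwrites; return NAND writes per host write for that window."""
    start = ftl.nand_writes
    for _ in range(host_pages):
        ftl.host_write(ftl.rng.randrange(ftl.logical_pages))
    return (ftl.nand_writes - start) / host_pages

ftl = ToyFTL()
fresh = write_amplification(ftl, ftl.logical_pages // 2)  # erased blocks still available
for _ in range(3):                                        # precondition with drive-fills
    write_amplification(ftl, ftl.logical_pages)
steady = write_amplification(ftl, ftl.logical_pages)
print(f"fresh-drive write amplification: {fresh:.2f}")
print(f"post-fill write amplification:   {steady:.2f}")
```

On the fresh drive every host write costs exactly one NAND write, because no block has needed reclaiming yet. Only after the drive-fills does garbage collection start relocating data, and that is the state a test must reach, or pre-load, to see realistic performance.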
Power optimization, security, and compression are even more important than ever because the need for secure storage, using less power, is imperative in the data center.
As firmware increases in size and scope, FW development and verification need to start earlier, typically running concurrently with HW design so that bugs are found prior to tapeout.
Bringing HW emulation into the flow
Hardware emulation is a viable option for many parts of the SSD verification flow. In general, simulation gives full visibility, lets engineers easily force error conditions, verifies design blocks, and provides visibility into bugs found in the lab. On the other hand, FPGA prototyping allows for faster and more extensive testing, allows controller connections to external hardware used in or by the drive, and allows for firmware test development with real hardware, but it has limited visibility for debug and is much less flexible.
Emulation spans the gap between simulation and FPGA prototyping, as it’s faster than simulation, provides more visibility than the FPGA, allows controller connections to external hardware used in or by the drive, runs on real firmware, creates confidence before the FPGA prototype is available, and enables the same setup for both pre- and post-silicon verification. In addition, hardware emulation works in concert with FPGA prototyping for more effective debug and root-cause analysis.
Figure 2. Three foundations of storage verification.
In the traditional in-circuit emulation (ICE) mode, an emulator connects to physical interface devices that allow the execution and debug of embedded systems. The debugger adapter enables processor debug for firmware (rather than just a model) and use validation. The host speed adapter (e.g., PCIe, SATA, or SAS) allows a connection to host testers while reusing existing test scripts.
This setup enables designers to develop, test, and debug the full test suite against the SoC. Verification engineers are able to find SoC bugs prior to tapeout and reduce the number of tapeouts required to ship product. However, ICE comes with several limitations when verifying SSD designs. Fortunately, a virtual emulation environment addresses all of these limitations.
In general, a virtual solution replaces a physical implementation with a model. A purely virtual solution only uses models — no cabling from the emulator to a physical device.
Some companies have had notable success using an approach that virtualizes the parts of the controller that are well understood at an interface level and that allows for much greater flexibility in making design and architectural changes — all while providing greater visibility into the DUT.
If the host design is PCIe/NVMe, for example, the interface itself is standardized and well known. If we could stimulate that interface in system simulation with something configurable enough to hit corner cases, but simple enough to make bring-up doable in a very short time-frame (without writing an entire testbench to reinvent the wheel), then that would cover a major portion of the controller testing.
At the same time, the NAND interfaces (both toggle and ONFI) are well-known, but the underlying 3D NAND technology and device physics are highly complex, and probably still under development if your controller is forward looking. That means the target device probably does not even exist, and there is only an early specification. However, if a model exists for that device, the same process done on the host interface can be done with the NAND interface. Simply drop in a model replacement.
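The drop-in idea can be illustrated in ordinary software terms. The sketch below is an analogy only: the `NandModel` interface, the `SoftNand` class, and the geometries are hypothetical and do not represent any emulator's soft models or the actual toggle or ONFI signaling. The point is that the test body depends only on the interface, so a model built from an early specification can be swapped for another without touching the test.

```python
from typing import Protocol

class NandModel(Protocol):
    """Hypothetical interface that the rest of the test environment talks to."""
    def program(self, block: int, page: int, data: bytes) -> None: ...
    def read(self, block: int, page: int) -> bytes: ...
    def erase(self, block: int) -> None: ...

class SoftNand:
    """Behavioral stand-in for a NAND device, configured from an early spec."""

    def __init__(self, blocks: int, pages_per_block: int, page_bytes: int):
        self.blocks = blocks
        self.pages_per_block = pages_per_block
        self.page_bytes = page_bytes
        self._cells = {}  # (block, page) -> bytes

    def program(self, block, page, data):
        if (block, page) in self._cells:
            raise RuntimeError("page must be erased before it is programmed again")
        if len(data) != self.page_bytes:
            raise ValueError("data must be exactly one page")
        self._cells[(block, page)] = data

    def read(self, block, page):
        # Erased flash reads back as all ones.
        return self._cells.get((block, page), b"\xff" * self.page_bytes)

    def erase(self, block):
        for page in range(self.pages_per_block):
            self._cells.pop((block, page), None)

def smoke_test(nand: NandModel, page_bytes: int) -> bool:
    """The same test body runs against any model that satisfies NandModel."""
    payload = bytes([0xA5]) * page_bytes
    nand.program(0, 0, payload)
    ok = nand.read(0, 0) == payload
    nand.erase(0)
    return ok and nand.read(0, 0) == b"\xff" * page_bytes

# Two placeholder geometries standing in for a current and a future device.
for geometry in [(1024, 256, 4096), (2048, 512, 16384)]:
    print(geometry, "passed:", smoke_test(SoftNand(*geometry), geometry[2]))
```

Retargeting the test to a new device is a one-line change of geometry here, which mirrors the flexibility the model-based approach is meant to provide.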
But how do we know the model is good enough? To answer this question, some empirical data is useful. One company started their production firmware development at the same time as the hardware. They used Veloce soft models to emulate the NAND devices, and they found that the firmware that passed on the emulator [using soft models for NAND] had first-pass success when run on the real chip. By design, the DUT on the emulator is identical to the chip.
The fact is, a well-designed model speeds development time and moves firmware integration to very early in the project schedule such that any model differences with the actual NAND device, or physical host, are trivial compared to time lost implementing a physical ICE-based system. Illustrating the now-virtualized environment, we have a new picture of the system.
Figure 3. Virtual emulation deployment on Veloce.
Using virtual mode and testing a virtual system also allows the emulator to be part of a data center use model, enabling engineers to run simulations from their desks and share the emulator with multiple users at the same time. Firmware development in a virtual environment can start at the same time as design creation. Traditional storage firmware development and testing starts in earnest when the silicon is in the lab, but successful companies have proven that software-driven design flows let firmware start with hardware definition. When this happens, the overall design time and the time-to-market shrink.
The greatest advantage to virtualizing your verification is flexibility. Hard drive controller development rarely attempted targeting multiple variants of spinning media; however, that is exactly what SSD controllers must do. The ability to re-configure a design to target a completely new NAND device, and get accurate performance data prior to silicon, gives controller teams a big advantage.
As storage technology and use models continue to evolve, so do the verification tools needed to solve today’s challenges. The Veloce emulator is well suited to address these challenges, as has been proven by leading storage companies that are using it today in their production environments.
New storage technologies reveal new opportunities to use more flexible and powerful tools, including a virtual host machine driving PCIe traffic and an entire NAND configuration with the flexibility of soft models. This enables the emulator to run multiple jobs in parallel, creating efficiencies not possible using ICE mode. Moreover, Veloce’s save-and-restore capability allows designers to free up the emulator while they debug a previous run.
To take a closer look at the challenges of the storage market, the evolution of SSD, and how an emulation-based verification methodology offers design teams a significant advantage—as well as how to implement an SSD controller in Veloce—please download the new whitepaper Using Emulation to Deliver Storage Market Innovations.
Paul Morrison is a technical marketing engineer in the MED Solution Marketing group at Mentor, a Siemens Business.