Combining wired networking and computational resources on the same card to offload tasks from server CPUs.
Network interface cards (NICs) have been on the market since shortly after the first PCs in the mid-1980s. However, over the past few years, we’ve seen the emergence of SmartNICs.
What is a SmartNIC? The most basic definition of a SmartNIC is simply a programmable NIC. Others have overloaded the concept by heaping vast amounts of silicon and firmware into their implementations. A good working definition is that a SmartNIC is a NIC that includes additional computational resources exposed to the customer, along with the necessary open-source tools to utilize those resources. These additional resources process network traffic as it enters and exits the server, and they offload application-level work from the host CPU.
SmartNICs are the fusion of wired networking and computational resources on the same card. These computational resources can be drawn from one or more of the following categories: traditional CPU cores such as Arm cores; purpose-built cores such as digital signal processors (DSPs), artificial intelligence (AI) engines, or network processing units (NPUs); or field-programmable gate arrays (FPGAs). It’s not uncommon for more than one of these computational elements to be included on a SmartNIC.
Every server connects to a network using a NIC. Some NICs are embedded wireless connections, which typically serve Internet of Things (IoT) devices like cameras and thermostats, but most servers are wired to the network. Servers use wired connections for many reasons, but the two most prominent are performance and availability.
With availability, a wired network only fails when the cable is damaged or removed. For network performance, we focus on two metrics: bandwidth, the volume of data you can move through the network, and latency, the time you spend waiting to move one piece of data.
With data center networking today at 25 GbE and rapidly moving to 50 GbE and 100 GbE, additional computational resources on a SmartNIC need to be carefully considered. The most efficient use of traditional CPU cores like those from Arm is to reserve them for control-plane management. For example, one dual- or quad-core Arm complex is often used for control-plane management tasks like loading software into other computational units and logging.
Data center NICs today handle millions, and potentially more than one hundred million, network packets per second. Arm cores, even those clocked at 3 GHz, aren’t up to the task of inspecting and acting upon millions, much less tens of millions, of packets per second each. There aren’t enough instructions per second to keep up with these volumes. Special purpose computational resources like dedicated network processors, FPGAs, or GPU cores are needed to process these volumes.
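To make the instruction-budget argument concrete, here is a rough back-of-the-envelope sketch. It assumes minimum-size 64-byte Ethernet frames and 20 bytes of per-frame wire overhead (preamble, start-of-frame delimiter, and inter-frame gap); neither figure comes from the text above, and the function names are purely illustrative.

```python
# Rough per-packet cycle budget for one core at line rate.
# Assumptions (not from the article): 64-byte minimum frames and
# 20 bytes of wire overhead per frame (7 preamble + 1 SFD + 12 gap).

WIRE_OVERHEAD_BYTES = 20


def max_packets_per_second(link_gbps: float, frame_bytes: int = 64) -> float:
    """Upper bound on frames per second the link can carry."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_per_frame


def cycles_per_packet(core_hz: float, packets_per_second: float) -> float:
    """Clock cycles a single core could spend on each packet."""
    return core_hz / packets_per_second


if __name__ == "__main__":
    for gbps in (25, 50, 100):
        pps = max_packets_per_second(gbps)
        budget = cycles_per_packet(3e9, pps)
        print(f"{gbps} GbE: {pps / 1e6:.1f} Mpps, "
              f"{budget:.0f} cycles per packet on one 3-GHz core")
```

Under these assumptions, a 100-GbE link can deliver roughly 149 million minimum-size packets per second, which leaves a single 3-GHz core about 20 clock cycles per packet. That is too few to parse headers, look up a rule, and act on the result.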
The parallel processing and programmable logic of FPGAs often make them best suited for the task. They can be configured to rapidly parse the network packet header, or even the body, and then take the necessary action at line rate, from dropping the packet to wrapping it or changing its contents entirely. An excellent example of an FPGA-based SmartNIC that includes an Arm complex and a network processor is the Xilinx Alveo SN1000 SmartNIC.
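The parse-then-act pattern can be sketched in software, although a real FPGA pipeline would express it in hardware logic rather than Python. The sketch below reads the EtherType and the IPv4 protocol field from a raw Ethernet frame and returns an action. The policy (drop UDP, forward everything else) and the names `classify` and `BLOCKED_PROTOCOLS` are hypothetical and chosen only for illustration.

```python
import struct

ETHERNET_HEADER_LEN = 14       # destination MAC, source MAC, EtherType
IPV4_MIN_HEADER_LEN = 20
ETHERTYPE_IPV4 = 0x0800
BLOCKED_PROTOCOLS = {17}       # hypothetical policy: drop UDP (protocol 17)


def classify(frame: bytes) -> str:
    """Return "drop" or "forward" for a raw Ethernet frame."""
    if len(frame) < ETHERNET_HEADER_LEN:
        return "drop"                      # truncated frame
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        return "forward"                   # this sketch only inspects IPv4
    if len(frame) < ETHERNET_HEADER_LEN + IPV4_MIN_HEADER_LEN:
        return "drop"                      # truncated IPv4 header
    ip_header = frame[ETHERNET_HEADER_LEN:]
    version = ip_header[0] >> 4            # high nibble of the first byte
    protocol = ip_header[9]                # IPv4 protocol field
    if version != 4:
        return "drop"
    return "drop" if protocol in BLOCKED_PROTOCOLS else "forward"


if __name__ == "__main__":
    ipv4_udp = bytes([0x45, 0, 0, 20, 0, 0, 0, 0, 64, 17, 0, 0,
                      10, 0, 0, 1, 10, 0, 0, 2])
    frame = bytes(12) + b"\x08\x00" + ipv4_udp
    print(classify(frame))
```

Every field sits at a fixed offset, which is part of why this kind of parsing maps well onto parallel hardware: each field can be extracted and compared at the same time rather than one instruction after another.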
CPU offload is a significant value proposition for a SmartNIC. Computationally intensive tasks like hashing for blockchains and transcoding video can be handled by the SmartNIC itself, freeing up precious server CPU resources.
Blockchains rely on solving proof of work. The first node on a network that reaches a solution is provided a reward, and then permitted to bundle up and publish the next block on the chain. SmartNICs can hold the blockchain and pending transactions in memory while computing the next solution. If they win, then the SmartNIC publishes the block and moves on to the next block.
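As a minimal sketch of the hashing work involved, the loop below searches for a nonce whose SHA-256 digest falls under a difficulty target. It is a simplification under stated assumptions: real chains hash a structured block header and use their own difficulty encoding, and the function name `mine` and its parameters are illustrative rather than taken from any particular blockchain.

```python
import hashlib
from typing import Optional, Tuple


def mine(block_data: bytes, difficulty_bits: int,
         max_nonce: int = 1_000_000) -> Optional[Tuple[int, str]]:
    """Search for a nonce whose digest has `difficulty_bits` leading zero bits.

    Returns (nonce, hex digest) on success, or None if no nonce below
    max_nonce satisfies the target.
    """
    target = 1 << (256 - difficulty_bits)   # digest must be below this value
    for nonce in range(max_nonce):
        payload = block_data + nonce.to_bytes(8, "big")
        digest = hashlib.sha256(payload).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None


if __name__ == "__main__":
    result = mine(b"pending transactions", difficulty_bits=12)
    print(result)
```

Each nonce attempt is independent of every other, so the search parallelizes naturally. That independence is what makes this workload a candidate for offload hardware instead of general-purpose CPU cores.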
Video transcoding is another popular host CPU offload that lends itself well to SmartNICs. Transcoding video using adaptive-bitrate (ABR) compression to support mobile devices is CPU-intensive, particularly for live video applications. These compression tasks are extremely linear and have been ported to FPGA-based accelerators, where they’ve proven to be 10X to 20X more efficient than general-purpose CPUs.
A SmartNIC could also include a basic Netfilter firewall, relieving the host CPU of filtering all inbound and outbound packets. Netfilter is the packet-filtering framework in the Linux kernel, configured through tools such as iptables and its successor nftables, and it provides a very robust architecture for filtering network traffic. Offloading this firewall to a SmartNIC could save the host CPU millions of instructions per second that could then be applied to the applications running on that server.
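The per-packet work that a firewall offload takes over can be illustrated with a first-match rule table. This is a conceptual sketch only: it does not use or mirror the Netfilter API, and the `Rule` fields and the `decide` function are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable, Optional


@dataclass(frozen=True)
class Rule:
    source: str               # source network in CIDR notation
    dest_port: Optional[int]  # None matches any destination port
    action: str               # "accept" or "drop"


def decide(rules: Iterable[Rule], src_ip: str, dest_port: int,
           default: str = "accept") -> str:
    """Return the action of the first matching rule, else the default."""
    address = ip_address(src_ip)
    for rule in rules:
        in_network = address in ip_network(rule.source)
        port_matches = rule.dest_port in (None, dest_port)
        if in_network and port_matches:
            return rule.action
    return default


if __name__ == "__main__":
    rules = [
        Rule("10.0.0.0/8", 22, "accept"),   # SSH allowed from internal hosts
        Rule("0.0.0.0/0", 22, "drop"),      # SSH blocked from everywhere else
    ]
    print(decide(rules, "10.1.2.3", 22))
    print(decide(rules, "192.0.2.1", 22))
```

On the host, a loop like this runs for every packet in both directions. Moving the same decision onto the SmartNIC means the host CPU never sees the packets that would have been dropped.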
We also have packet wrapping, known as encapsulation. Whenever we utilize overlay networks for virtualized or containerized systems, we need to wrap the network packets so that they can be routed between these overlay networks. An example of overlay network processing is Open vSwitch (OvS), which can be very CPU intensive, so offloading this task to a SmartNIC frees up significant host CPU cycles.
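As one illustration of encapsulation, the sketch below wraps an inner frame in a VXLAN header. VXLAN is assumed here as a common overlay format; the text above does not say which encapsulation is in use. The outer Ethernet, IP, and UDP headers that a real implementation would also add are omitted, and the function names are illustrative.

```python
import struct

VXLAN_HEADER_LEN = 8
VXLAN_FLAG_VNI_VALID = 0x08   # "I" flag: the VNI field is valid
MAX_VNI = (1 << 24) - 1       # the network identifier is 24 bits wide


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend an 8-byte VXLAN header carrying the network identifier."""
    if not 0 <= vni <= MAX_VNI:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + reserved (24 bits).
    # Word 2: VNI (24 bits) + reserved (8 bits).
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def decapsulate(packet: bytes) -> tuple:
    """Return (vni, inner_frame) from a VXLAN payload."""
    if len(packet) < VXLAN_HEADER_LEN:
        raise ValueError("packet shorter than a VXLAN header")
    flags_word, vni_word = struct.unpack_from("!II", packet, 0)
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return vni_word >> 8, packet[VXLAN_HEADER_LEN:]


if __name__ == "__main__":
    wrapped = encapsulate(b"inner ethernet frame", vni=5001)
    print(decapsulate(wrapped))
```

The wrapping itself is cheap, but it must happen for every packet crossing the overlay, which is why it becomes a meaningful share of host CPU time at data center packet rates.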
Finally, we could also offload primary network applications that might typically run on the server like DNS or in-memory databases. Processing DNS queries entirely within the SmartNIC is a typical SmartNIC application as the transactions are small and the table lookups are quickly processed.
A SmartNIC can also double as a storage controller. Some SmartNICs, like Xilinx’s Alveo U25, carry gigabytes of their own on-chip and on-board memory (6 GB in the case of the U25). This memory can double as a cache for the server’s own NVMe disks. This will become important as protocols like Compute Express Link (CXL) enable future SmartNICs to manage the relationship with NVMe drives directly.
SmartNICs could also do erasure coding in the hardware as well as storage encryption. For drive encryption, SmartNICs offer a unique security angle. If a SmartNIC encrypts or decrypts data going to NVMe storage, then both elements are required if someone wishes to break the encryption. If an admin removes the drives to decrypt them elsewhere, they would then need brute force to guess the missing encryption keys that were left behind on the SmartNIC.
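Erasure coding can be illustrated with its simplest form, a single XOR parity block that allows any one lost block in a stripe to be rebuilt. This is a sketch under stated assumptions: production systems typically use stronger codes such as Reed-Solomon that tolerate multiple failures, and the function names here are illustrative.

```python
from functools import reduce
from typing import List, Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if len({len(block) for block in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def add_parity(data_blocks: Sequence[bytes]) -> List[bytes]:
    """Return the stripe: the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe: Sequence[bytes], missing_index: int) -> bytes:
    """Rebuild the block at missing_index from all surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != missing_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    stripe = add_parity([b"abcd", b"efgh", b"ijkl"])
    print(recover(stripe, 1))
```

The arithmetic is simple bitwise work repeated over large volumes of data, which is the kind of task that suits dedicated hardware on the card.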
SmartNICs can easily employ cryptography to secure their keys between power cycles, further making the system both robust and secure. Solarflare (now part of Xilinx), for example, has maintained a hardware security enclave on the NIC to store the NIC’s keys within its X2 silicon for the past several years. Future SmartNIC security enclaves could potentially save and secure hundreds of thousands of security keys for SSL/TLS end-point encryption.
One final special case where SmartNICs shine is ultra-low-latency electronic trading. We’re talking about moving network packets in tens of billionths of a second. Today, latency on high-performance 25-GbE NICs is in the range of 1,000ns. With a properly architected system, the right software, and a tuned SmartNIC, network packets can be analyzed as they’re being received, four bytes at a time. The response packet can then be injected into the network in a blindingly fast 22ns. This is over 40X faster than traditional high-performance NICs. When deployed in electronic trading, the return on investment (ROI) for these SmartNICs can sometimes be measured in fractions of a second.
As cloud service providers scale capacity upwards, they are increasing their deployment of SmartNICs to free up valuable CPU cores for business applications and to optimize server utilization. Today’s servers often spend 30% of their CPU cycles managing networking. Reclaiming those cycles would be like gaining nearly one new server for every three in production. SmartNICs enable system architects to place high-performance computing resources at the very edge of the server: the network. SmartNICs can then be leveraged to protect the server, and therefore the enterprise, while also dramatically offloading the much more expensive server CPUs.
According to market research firm Dell’Oro Group, the SmartNIC market is forecast to surpass $600M and comprise 23% of the worldwide Ethernet adapter market by 2024, and we’ve seen a new generation of SmartNICs from companies like Broadcom, Intel, Mellanox, and Xilinx. So, when you’re designing your next data-center deployment, instead of defaulting to the standard NIC that comes with your server, consider how SmartNICs might fit into your plans.