Securing AI/ML Training And Inference Workloads

Key elements and actions for preserving data integrity.


AI/ML can be thought of in terms of two distinct and essential functions: training and inference. Each is vulnerable to different types of security attacks, and this blog will look at some of the ways in which hardware-level security can protect sensitive data and devices across AI workflows and pipelines.

The security challenges encountered with AI/ML workloads can be addressed by implementing data confidentiality, integrity and authenticity at the main stages of the AI workflow, during both training and inference. This applies to the data center as well as to edge devices, from gateways to endpoints such as smartphones and IoT devices.

High-value training data sets need to be protected not only from theft, but also from so-called data poisoning. Encryption with integrity is essential to keep training data confidential and unaltered. The encryption key used must be unique to the customer and used in a secure environment. Encryption and decryption operations on data occur constantly and must be performed in a manner that prevents leakage of the key. Should a compromise arise, it must be possible to renew the key securely and re-encrypt the data with the new key.
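The encrypt-with-integrity and key-renewal flow described above can be sketched in software. This is an illustrative model only, not a production implementation: a real deployment would use an authenticated cipher such as AES-GCM in hardware, whereas this sketch uses an HMAC-based keystream and encrypt-then-MAC so it runs with the Python standard library alone. All function names are hypothetical.

```python
import hmac, hashlib, secrets

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # PRF-based keystream: block i = HMAC-SHA256(key, nonce || i).
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(4, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt_with_integrity(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)
    ct = bytes(p ^ k for p, k in zip(plaintext, _keystream(enc_key, nonce, len(plaintext))))
    # Encrypt-then-MAC: the tag covers nonce + ciphertext, so any tampering
    # (e.g. data poisoning of stored training shards) is detected.
    tag = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def decrypt_with_integrity(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()):
        raise ValueError("integrity check failed: data may be poisoned or corrupted")
    return bytes(c ^ k for c, k in zip(ct, _keystream(enc_key, nonce, len(ct))))

def rotate_key(old_keys: tuple, new_keys: tuple, blob: bytes) -> bytes:
    # On suspected compromise: decrypt under the old key, re-encrypt under a fresh one.
    return encrypt_with_integrity(*new_keys, decrypt_with_integrity(*old_keys, blob))
```

Note that integrity is verified before decryption, and key renewal is a single decrypt/re-encrypt pass, which is why it must also happen inside a secure environment.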

In addition, the encryption key must be securely stored in a location that unauthorized processes or individuals cannot access. Keys must be protected both from attempts to read them from the device and from attempts to extract them through side-channel attacks (SCA) or fault injection attacks (FIA). This is especially important for edge AI and processing devices, which adversaries can physically access. But even in data centers, where AI processing devices sit in secure rooms with 24/7 surveillance, the multitenancy of modern data centers calls for SCA protection of key material.

SCAs use electromagnetic or power emissions from semiconductors to derive the keys used during cryptographic operations. An attacker can place a probe near a device while it performs encryption or decryption, capture samples of its electromagnetic or power emissions, and use statistical analysis to deduce the encryption key. Once in possession of the key, the attacker can recover the protected inference model and the inference data. AI-enabled devices, whose numbers are growing dramatically, need to add SCA countermeasures to the cryptographic mechanisms already used to protect inference data and models.
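One widely used SCA countermeasure is masking: the secret is split into random shares so that no single intermediate value in the computation correlates with the key, which defeats the statistical analysis described above. Hardware countermeasures are far more involved, but the core idea can be sketched in a few lines (first-order Boolean masking; all names are illustrative):

```python
import secrets

def mask_secret(secret: bytes):
    # Split the secret into two shares whose XOR equals the secret.
    # Each share on its own is uniformly random, so power/EM leakage
    # from handling one share reveals nothing about the secret.
    mask = secrets.token_bytes(len(secret))
    masked = bytes(s ^ m for s, m in zip(secret, mask))
    return masked, mask

def unmask(masked: bytes, mask: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(masked, mask))

def xor_with_masked(data: bytes, masked: bytes, mask: bytes) -> bytes:
    # Combine data with each share separately; the unmasked secret never
    # exists as a single intermediate value, which is the point of masking.
    tmp = bytes(d ^ a for d, a in zip(data, masked))
    return bytes(t ^ b for t, b in zip(tmp, mask))
```

Real DPA-resistant hardware applies this principle throughout a cipher's non-linear operations and refreshes the shares continuously; the sketch only shows the linear (XOR) case.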

In addition to secure key storage, secure cryptographic processing of AI training data and models is also required. With secure cryptographic processing, decryption/encryption operations are performed in a secure enclave that only privileged operations can access. Ideally, secure cryptographic processing is handled by a Root of Trust core. The AI service provider manages the Root of Trust firmware, but the Root of Trust can also load secure applications that customers write to implement their own cryptographic key management and storage.
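The contract a Root of Trust offers to the rest of the system can be modeled as an opaque-handle API: callers refer to keys by handle and ask the Root of Trust to perform operations, but raw key bytes never cross the boundary. The class below is a hypothetical software model of that interface, not any vendor's actual API:

```python
import hmac, hashlib, secrets

class RootOfTrustSketch:
    """Illustrative model of a Root of Trust key service: callers receive
    opaque handles, never raw key bytes, and all cryptographic use of a
    key happens 'inside' this object (isolated hardware in a real RoT)."""

    def __init__(self):
        self._keys = {}  # handle -> key material, never exposed

    def generate_key(self) -> int:
        handle = secrets.randbits(32)
        self._keys[handle] = secrets.token_bytes(32)
        return handle  # only the opaque handle leaves the boundary

    def mac(self, handle: int, data: bytes) -> bytes:
        # The key is used on the caller's behalf, not exported.
        return hmac.new(self._keys[handle], data, hashlib.sha256).digest()

    def destroy_key(self, handle: int) -> None:
        # Supports key renewal: retire a compromised key irrevocably.
        self._keys.pop(handle, None)
```

A customer-written secure application running on the Root of Trust would sit behind exactly this kind of boundary, adding its own key-management policy on top.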

The Root of Trust can be integrated in the host CPU that orchestrates the AI operations, decrypting the AI model and its parameters before they are fed to AI or network accelerators (GPUs or NPUs). It can also be integrated directly with the GPUs and NPUs to perform encryption/decryption at that level. These GPUs and NPUs may also elect to store AI workloads and inference models in encrypted form in their local memory banks and decrypt the data on the fly when access is required. Dedicated low-latency inline memory decryption engines based on the AES-XTS algorithm can keep up with the memory bandwidth, ensuring that the process is not slowed down.
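The key property of XTS-style inline memory encryption is that the memory address acts as a tweak, so identical plaintext stored at two different addresses produces different ciphertext. The sketch below illustrates only that addressing structure; a real engine uses AES-XTS in hardware, while this model substitutes an HMAC-derived pad per cache line and is not a secure cipher. Names and the 64-byte line size are illustrative assumptions.

```python
import hmac, hashlib

LINE = 64  # transform memory one 64-byte cache line at a time (assumed size)

def line_pad(key: bytes, addr: int) -> bytes:
    # The line address is the tweak: identical plaintext at different
    # addresses transforms differently (the property AES-XTS provides).
    # HMAC-SHA256 stands in for the AES-XTS engine in this sketch.
    block = hmac.new(key, addr.to_bytes(8, "little"), hashlib.sha256).digest()
    return (block + hmac.new(key, block, hashlib.sha256).digest())[:LINE]

def memory_transform(key: bytes, base_addr: int, data: bytes) -> bytes:
    # XOR-based transform is its own inverse: the same call encrypts
    # on the way to memory and decrypts on the way back.
    out = bytearray(len(data))
    for off in range(0, len(data), LINE):
        pad = line_pad(key, base_addr + off)
        chunk = data[off:off + LINE]
        out[off:off + len(chunk)] = bytes(c ^ p for c, p in zip(chunk, pad))
    return bytes(out)
```

Because each line's transform depends only on the key and its address, lines can be decrypted independently and in any order, which is what lets a hardware engine hide the latency inside the memory pipeline.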

AI training workloads are often distributed among dozens of devices connected via PCIe or high-speed networking technology such as 800G Ethernet. An efficient confidentiality and integrity protocol such as MACsec using the AES-GCM algorithm can protect the data in motion over high-speed Ethernet links. AES-GCM engines integrated with the server SoC and the PCIe acceleration boards ensure that traffic is authenticated and optionally encrypted.
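The structure of that protection can be sketched as follows: the frame header travels in the clear but is authenticated (the role AES-GCM's additional authenticated data plays in MACsec), payload encryption is optional, and a single tag covers both. This is an illustrative stdlib model with HMAC standing in for AES-GCM; the function names and layout are assumptions, not the MACsec wire format.

```python
import hmac, hashlib, secrets

def _prf(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def protect_frame(enc_key, mac_key, header: bytes, payload: bytes, encrypt=True) -> bytes:
    # Header stays in the clear but is authenticated; the tag covers
    # header + nonce + body, so link traffic cannot be silently altered.
    nonce = secrets.token_bytes(12)
    body = payload
    if encrypt:  # encryption is optional, as in MACsec
        ks = b"".join(_prf(enc_key, nonce + i.to_bytes(4, "big"))
                      for i in range((len(payload) + 31) // 32))
        body = bytes(p ^ k for p, k in zip(payload, ks))
    tag = _prf(mac_key, header + nonce + body)
    return header + nonce + body + tag

def verify_frame(enc_key, mac_key, frame: bytes, header_len: int, encrypt=True):
    header = frame[:header_len]
    nonce = frame[header_len:header_len + 12]
    body, tag = frame[header_len + 12:-32], frame[-32:]
    if not hmac.compare_digest(tag, _prf(mac_key, header + nonce + body)):
        raise ValueError("frame failed authentication")
    if not encrypt:
        return header, body
    ks = b"".join(_prf(enc_key, nonce + i.to_bytes(4, "big"))
                  for i in range((len(body) + 31) // 32))
    return header, bytes(b ^ k for b, k in zip(body, ks))
```

Authenticate-always, encrypt-optionally matches the deployment choice the paragraph describes: some links need only integrity at line rate, others need confidentiality too.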

There is a broad collection of security IP covering these key security elements for AI/ML workloads. Rambus Root of Trust IP cores, for example, implement a secure boot protocol that protects the integrity of device firmware. This can be combined with inline memory encryption engines, as well as dedicated solutions for MACsec up to 800G. These protocol engines contain input/output interfaces for supplying key material and can work in collaboration with the Root of Trust, whereby the Root of Trust generates and stores the keys used by the high-speed crypto engines.

