High-value datasets containing sensitive information must be shielded from unauthorized access.
AI has permeated virtually every aspect of our digital lives, from personalized recommendations on streaming platforms to advanced medical diagnostics. Behind the scenes of this AI revolution lies the data center, which houses the hardware, software, and networking infrastructure necessary for training and deploying AI models. Securing AI in the data center relies on data confidentiality, integrity, and authenticity throughout the AI lifecycle, from data preprocessing to model training and inference deployment.
High-value datasets containing sensitive information, such as personal health records or financial transactions, must be shielded from unauthorized access. Robust encryption mechanisms, such as Advanced Encryption Standard (AES), coupled with secure key management practices, form the foundation of data confidentiality in the data center. The encryption key used must be unique and used in a secure environment. Encryption and decryption operations of data are constantly occurring and must be performed to prevent key leakage. Should a compromise arise, it should be possible to renew the key securely and re-encrypt data with the new key.
The encryption key used must also be securely stored in a location that unauthorized processes or individuals cannot access. The keys used must be protected from attempts to read them from the device or attempts to steal them using side-channel techniques such as SCA (Side-Channel Attacks) or FIA (Fault Injection Attacks). The multitenancy aspect of modern data centers calls for robust SCA protection of key data.
Hardware-level security plays a pivotal role in safeguarding AI within the data center, offering built-in protections against a wide range of threats. Trusted Platform Modules (TPMs), secure enclaves, and Hardware Security Modules (HSMs) provide secure storage and processing environments for sensitive data and cryptographic keys, shielding them from unauthorized access or tampering. By leveraging hardware-based security features, organizations can enhance the resilience of their AI infrastructure and mitigate the risk of attacks targeting software vulnerabilities.
Ideally, secure cryptographic processing is handled by a Root of Trust core. The AI service provider manages the Root of Trust firmware, but it can also load secure applications that customers can write to implement their own cryptographic key management and storage applications. The Root of Trust can be integrated in the host CPU that orchestrates the AI operations, decrypting the AI model and its specific parameters before those are fed to AI or network accelerators (GPUs or NPUs). It can also be directly integrated with the GPUs and NPUs to perform encryption/decryption at that level. These GPUs and NPUs may also select to store AI workloads and inference models in encrypted form in their local memory banks and decrypt the data on the fly when access is required. Dedicated on-the-fly, low latency in-line memory decryption engines based on the AES-XTS algorithm can keep up with the memory bandwidth, ensuring that the process is not slowed down.
AI training workloads are often distributed among dozens of devices connected via PCIe or high-speed networking technology such as 800G Ethernet. An efficient confidentiality and integrity protocol such as MACsec using the AES-GCM algorithm can protect the data in motion over high-speed Ethernet links. AES-GCM engines integrated with the server SoC and the PCIe acceleration boards ensure that traffic is authenticated and optionally encrypted.
Rambus offers a broad portfolio of security IP covering the key security elements needed to protect AI in the data center. Rambus Root of Trust IP cores ensure a secure boot protocol that protects the integrity of its firmware. This can be combined with Rambus inline memory encryption engines, as well as dedicated solutions for MACsec up to 800G.
Resources
Leave a Reply