Securing Accelerator Blades For Datacenter AI/ML Workloads

Why security for these devices is different, and what to do about it.


Data centers handle enormous volumes of AI/ML training and inference workloads for their customers. Processing this volume of work efficiently has driven many new solutions to market. One of these is pluggable accelerator blades, often deployed in massively parallel arrays, that implement the latest state-of-the-art neural processing architectures. These blades handle valuable inference models, algorithms, and training data, and as such they require a high level of protection.

Machine learning assets face many different threats. These include input attacks that attempt to manipulate AI systems into making incorrect decisions, as well as theft of valuable assets such as inference models, algorithms, and training data. Attacks can target software, firmware, hardware, or all of these assets. They can be invasive or non-invasive. They can enter across the network, through an edge node, or directly target endpoints. With more and more AI-powered devices in our everyday lives, any attack on them can threaten privacy, property, and personal safety.

Accelerator blades contain several key components. The heart of the system is provided by powerful accelerator chips, ranging from a handful to a large array of dedicated AI/ML processing units, each with its own pool of attached memory. They aim to process as many tasks, on as much data, with as little latency as possible. Often, there is also a Gateway CPU, with its own dedicated flash and DDR memory, which manages models and assets, and programs and controls the accelerators. Finally, there is a connection to the fabric, provided by high-speed network or PCI Express (PCIe) interfaces.

Accelerator blades need to follow security requirements that, at a minimum, authenticate and protect the blade itself. However, several additional security requirements apply to AI/ML acceleration. As mentioned, protecting assets is a primary concern. This includes protecting assets from theft or replacement, and ensuring that data privacy regulations, such as HIPAA in the USA and GDPR in Europe, are adhered to. When accelerator blades are installed in public cloud servers, they are usually assigned to serve multiple users or tenants. In this case, the ability to switch between different users or tenants in a secure way is extremely important. Finally, security is also required to avoid system misuse, to ensure proper billing of services provided, and to prevent unethical use of the system.

According to the seminal Microsoft white paper on the topic, there are seven properties of highly secure devices: hardware root of trust, defense in depth, a small, trusted computing base, dynamic compartments, password-less authentication, error reporting, and renewable security. Each of these merits a detailed examination on their own, but for the purpose of this blog, we’ll be discussing some security use cases for the hardware root of trust (RoT) in the context of securing accelerator blades.

One of the major security use cases is ensuring the availability of the accelerators themselves. An adversary can tamper with accelerator hardware to deny or disrupt usage or bypass its security measures. A root of trust can monitor system status and memory content and detect tampering activity independently of the applications and the CPUs or MCUs. The root of trust can also detect security attacks like fault injection.

So, how exactly would this work? The root of trust and the Gateway CPU monitor test and debug logic, hardware configuration, and other hardware status in the SoC. The root of trust in the accelerator monitors AI accelerator operation. The root of trust periodically hashes known embedded SRAM state to detect tampering, and it can also periodically hash invariant flash data. Internal logic inside the root of trust detects physical attacks on the system, and a security protocol engine can monitor network traffic. The security applications running inside the secure boundary of the root of trust then determine how to act upon a detected anomaly.
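The periodic hashing of invariant memory described above can be sketched in a few lines. This is a minimal illustration, not a real root of trust implementation: the region names, snapshot bytes, and choice of SHA-256 are all assumptions for the example.

```python
import hashlib


class IntegrityMonitor:
    """Tracks known-good hashes of invariant memory regions and flags changes."""

    def __init__(self):
        self.baseline = {}  # region name -> expected SHA-256 digest

    def enroll(self, region: str, contents: bytes) -> None:
        # Record the known-good hash at provisioning or secure-boot time.
        self.baseline[region] = hashlib.sha256(contents).digest()

    def check(self, region: str, contents: bytes) -> bool:
        # Periodic re-hash: a mismatch indicates tampering.
        return hashlib.sha256(contents).digest() == self.baseline[region]


monitor = IntegrityMonitor()
sram_snapshot = b"\x00" * 64          # stand-in for embedded SRAM contents
monitor.enroll("sram", sram_snapshot)

print(monitor.check("sram", sram_snapshot))             # unmodified region
print(monitor.check("sram", b"\x00" * 63 + b"\x01"))    # tampered region
```

On a real blade, the "regions" would be SRAM ranges and invariant flash sectors read by the root of trust hardware, and a failed check would trigger one of the anomaly responses described above rather than a boolean return.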

Second, as we identified earlier, inference and training models are valuable assets that require protection. While in use or while loaded into the AI accelerators, these models can be intercepted, replaced, or altered. Upon completion of training, the resulting inference models need to be stored in encrypted form and decrypted dynamically upon use.

A root of trust implementation for this scenario would include the following steps. The signed, encrypted inference model is stored in an off-chip flash module. The root of trust module reads the inference model from flash, decrypts it, and hashes the decrypted data. The root of trust then verifies the signature and compares hashes. Only if the hashes match will the model be loaded into the accelerator. Alternatively, if each accelerator has its own flash, the local root of trust can handle this. The root of trust provides the encryption, hashing, and digital signature verification capabilities.
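The verify-then-load flow above can be sketched as follows. This is a hedged illustration only: HMAC-SHA256 stands in for the digital signature, a hash-derived keystream stands in for the hardware AES engine, and the key values are invented for the example. A real root of trust would use asymmetric signatures and dedicated crypto blocks.

```python
import hashlib
import hmac

SIGN_KEY = b"provisioned-sign-key"   # illustrative pre-provisioned keys
ENC_KEY = b"provisioned-enc-key"


def _keystream(key: bytes, n: int) -> bytes:
    # Toy stream cipher: expand the key with SHA-256 in counter mode.
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]


def store_model(model: bytes):
    # "Flash" holds the encrypted model plus a signature over the plaintext.
    enc = bytes(a ^ b for a, b in zip(model, _keystream(ENC_KEY, len(model))))
    sig = hmac.new(SIGN_KEY, model, hashlib.sha256).digest()
    return enc, sig


def load_model(enc: bytes, sig: bytes) -> bytes:
    # Root of trust: decrypt, re-hash, verify; load only if they match.
    model = bytes(a ^ b for a, b in zip(enc, _keystream(ENC_KEY, len(enc))))
    expected = hmac.new(SIGN_KEY, model, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, sig):
        raise ValueError("model integrity check failed; refusing to load")
    return model  # safe to program into the accelerator


enc, sig = store_model(b"inference-model-weights")
assert load_model(enc, sig) == b"inference-model-weights"
```

If the stored blob is modified in flash, decryption yields different plaintext, the recomputed hash no longer matches the signature, and `load_model` refuses to program the accelerator.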

Finally, when a complete AI ecosystem operates in inference mode, an adversary can target the inference process or tamper with the inference results. Protecting the inference results can be done using a secure channel, like the one used to protect input data integrity. In this case, the host module communicates with the edge devices over the network, with mutual authentication using pre-provisioned keys and identities. After a secure communication channel is established, the root of trust and the edge device manage the passing of inference results to the servers. Once loaded in the AI accelerator, the inference results are integrity-checked before being committed.
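A simplified sketch of that handshake is shown below. Real deployments would run an authenticated key-exchange protocol such as TLS; the HMAC challenge-response here is a stand-in, and the shared key and result payload are illustrative assumptions.

```python
import hashlib
import hmac
import os

SHARED_KEY = b"pre-provisioned-device-key"  # illustrative provisioned secret


def respond(key: bytes, challenge: bytes) -> bytes:
    # Prove possession of the shared key without revealing it.
    return hmac.new(key, challenge, hashlib.sha256).digest()


# Host authenticates the edge device with a fresh random challenge...
host_challenge = os.urandom(16)
device_response = respond(SHARED_KEY, host_challenge)
host_ok = hmac.compare_digest(device_response, respond(SHARED_KEY, host_challenge))

# ...the device does the same in the other direction for mutual authentication,
# and inference results then travel with an integrity tag over the channel.
result = b'{"label": "cat", "score": 0.97}'
tag = hmac.new(SHARED_KEY, result, hashlib.sha256).digest()
result_ok = hmac.compare_digest(
    tag, hmac.new(SHARED_KEY, result, hashlib.sha256).digest()
)
print(host_ok, result_ok)
```

The receiving side recomputes the tag over the payload it received and commits the result only when the tags match, mirroring the integrity check described above.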

Rambus has three decades of security expertise and a broad portfolio of hardware security IP solutions designed to support the security needs of high-performance data centers handling valuable data. We have root of trust solutions tailored to the needs of state-of-the-art accelerators for AI/ML training, and lightweight solutions appropriate to inference engines in IoT devices.
