Arm A-Profile Architecture Developments 2023

An overview of the latest features included in the Arm ISA, from FP8 to detection of memory safety violations.


As computing demands continue to evolve with the rise of artificial intelligence (AI) and increasingly sophisticated security threats, it is imperative that the foundational computing architecture at the heart of the world’s devices continues to evolve too. This is why our engineering teams add new features and technologies to the pervasive Arm architecture, while our software teams ensure that software can take advantage of these features and technologies as seamlessly as possible.

How the Arm architecture is developed

Arm releases annual updates to the Arm Instruction Set Architecture (ISA), which are created in collaboration with our diverse set of partners from across the Arm ecosystem. The process involves silicon partners, operating system vendors and OEMs, Arm’s internal engineering teams and standards bodies.

A strongly curated ISA ensures that software continues to work on historic and new hardware for years to come. Arm works closely with Linaro and a host of other partners to enable the Arm ISA in the most widely used upstream software communities, such as the Linux kernel and Linux distributions, to help deliver the broadest developer ecosystem on the planet.

Each September, we release a blog which discusses some of the key additions to the A-Profile architecture in that year. Alongside the blog, we release full Instruction Set and System Register documentation via our developer web pages.

The complete Arm Architecture Reference Manual (Arm ARM) is also updated annually. An update to include the 2023 extensions is due for release in early 2024. Updates to the ‘Learn the Architecture’ pages will also appear during 2023 and 2024.

Publishing the blog and documentation is only one step in deploying new architecture. The next step will be working with our ecosystem partners to ensure that open-source software is enabled to make use of this functionality as soon as the hardware becomes available.

In 2023, Arm is introducing features to support our ongoing focus on artificial intelligence (AI), machine learning (ML) and security. Enabling secure AI everywhere is a key priority for the Arm architecture, with the training of Neural Networks (NNs) critical to the continued development and advancement of AI. This is why the 2023 architecture extensions include a new 8-bit floating-point format called FP8 that is already seeing rapid adoption across NNs. For security, we are adding Checked Pointer Arithmetic, which builds on existing support for Arm Memory Tagging Extension (MTE) that allows developers to detect memory safety violations quickly, saving them costs and time during the application development process.

Details of previous updates to the A-Profile architecture are available for 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021 and 2022.

Let’s look at some of the new features we’ve added this year.

Floating Point 8 (FP8)

In 2022, Arm, Intel, and Nvidia announced their collaboration on FP8, an interchange format that allows software ecosystems to share NN models easily and supports the continuous advancement of AI computing capabilities. As part of the 2023 extensions, FP8 support is added to SME2, SVE2 and Advanced SIMD (Neon).

FP8 supports two data formats: E5M2 and E4M3. These two formats give different trade-offs between precision and range.
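To make the trade-off concrete, here is a minimal Python sketch that decodes a single byte in either format. The blog itself does not give the bit-level encodings, so the exponent biases and special-value rules below are assumptions taken from the publicly available FP8 format proposal, and decode_fp8 is a name invented for this illustration.

```python
def decode_fp8(byte: int, exp_bits: int, man_bits: int) -> float:
    """Decode one FP8 byte as E5M2 (exp_bits=5, man_bits=2)
    or E4M3 (exp_bits=4, man_bits=3).

    Assumed encodings (from the public FP8 proposal, not from this blog):
      E5M2: bias 15, all-ones exponent encodes infinity or NaN.
      E4M3: bias 7, no infinity, only S.1111.111 encodes NaN.
    """
    assert (exp_bits, man_bits) in ((5, 2), (4, 3))
    assert 0 <= byte <= 0xFF

    sign = -1.0 if (byte >> 7) else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    exp_all_ones = (1 << exp_bits) - 1
    man_all_ones = (1 << man_bits) - 1

    if exp_bits == 5 and exp == exp_all_ones:
        # E5M2 special values: infinity when mantissa is zero, else NaN.
        return sign * float("inf") if man == 0 else float("nan")
    if exp_bits == 4 and exp == exp_all_ones and man == man_all_ones:
        # E4M3 has a single NaN pattern per sign and no infinity.
        return float("nan")
    if exp == 0:
        # Subnormal: no implicit leading one.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    # Normal: implicit leading one.
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


# Largest finite value in each format.
e5m2_max = decode_fp8(0x7B, 5, 2)
e4m3_max = decode_fp8(0x7E, 4, 3)

# First representable value above 1.0 in each format.
e5m2_next = decode_fp8(0x3D, 5, 2)
e4m3_next = decode_fp8(0x39, 4, 3)
```

Under these assumptions, E5M2 reaches 57344 while E4M3 tops out at 448, but the first value above 1.0 is 1.25 in E5M2 and 1.125 in E4M3: the extra exponent bit buys range, and the extra mantissa bit buys precision.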

The format in use is selected by fields in the FPMR register. Different formats can be selected for the different inputs to an instruction, which makes it efficient to work with datasets in different formats. We firmly believe in the benefits of the industry coalescing around one 8-bit floating-point format, enabling developers to focus on innovation and differentiation where it really matters. We are excited to see how FP8 advances AI development in the future.

Live migration

Live migration is the process of moving a virtual machine (VM) from one host to another, while preserving availability and state. Support for efficient live migration is an important tool for large-scale data center management.

To implement live migration, a hypervisor copies pages to the new host while the VM is still running on the old host. This typically requires an iterative process, as the VM might ‘dirty’ a page that has already been copied. There are different approaches to solving this problem, but they must all contend with three challenges:

  • Recording: Creating a record of the pages the VM has written to (dirtied).
  • Surveying: Processing the records to determine which pages need to be re-copied.
  • Cleaning: Resetting the recording mechanism on each iteration.

The 2023 extensions introduce features to help optimize all three of these.
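To show where the three costs arise, here is a minimal software-only simulation of the iterative copy loop in Python. It is a sketch under simplifying assumptions: pages are dictionary entries, the guest's writes are supplied up front as a list per round, and names such as live_migrate and writes_per_round are hypothetical. It does not model any Arm feature.

```python
def live_migrate(source, writes_per_round, max_rounds=10):
    """Simulate iterative pre-copy of a VM's pages.

    source:           dict mapping page number -> contents on the old host
    writes_per_round: list where entry i holds the (page, value) writes the
                      guest performs while round i is being copied
    Returns (dest, rounds) where dest is the new host's copy of memory.
    """
    dest = {}
    dirty = set(source)  # initially every page still needs copying
    rounds = 0

    while dirty and rounds < max_rounds:
        to_copy = sorted(dirty)  # Surveying: decide what to re-copy
        dirty.clear()            # Cleaning: reset the record for this round

        for page in to_copy:
            dest[page] = source[page]

        # The guest keeps running, so pages may be written after being copied.
        writes = writes_per_round[rounds] if rounds < len(writes_per_round) else []
        for page, value in writes:
            source[page] = value
            dirty.add(page)      # Recording: note that this page was dirtied

        rounds += 1

    # Final step: guest is paused, so the remaining dirty pages can be
    # copied without any new writes arriving.
    for page in dirty:
        dest[page] = source[page]
    return dest, rounds


old_host = {0: "a", 1: "b", 2: "c"}
guest_writes = [[(1, "B")], [(1, "B2"), (2, "C")], []]
new_host, rounds_taken = live_migrate(old_host, guest_writes)
```

In this model each round pays for all three steps, which is why reducing their cost matters when the guest writes frequently.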

FEAT_HDBSS adds the ability to record a log of the stage 2 pages or blocks dirtied. This mechanism addresses the Recording cost, as the memory management unit (MMU) can efficiently create the log without interrupting execution of the VM. The log also addresses the Surveying cost, as the generated data is in a format that hypervisors can efficiently consume.

To address the Cleaning cost, FEAT_HACDBS adds an accelerator for cleaning the dirty state in the stage 2 translation tables. The engine uses the log of dirtied pages to locate the stage 2 translation table descriptors that need to be updated.
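The benefit of working from a log can be sketched in Python by counting how many descriptors have to be visited. This is only a model: the dictionary of dirty flags and the list used as a log are stand-ins chosen for illustration, not the hardware's descriptor or log formats, and the function names are hypothetical.

```python
def clean_by_scan(dirty_flags):
    """Clean by visiting every descriptor, dirty or not.
    Returns the number of descriptors visited."""
    visited = 0
    for page in dirty_flags:
        visited += 1
        dirty_flags[page] = False
    return visited


def clean_from_log(dirty_flags, log):
    """Clean by visiting only the descriptors named in the log.
    Returns the number of descriptors visited and empties the log."""
    visited = 0
    for page in log:
        visited += 1
        dirty_flags[page] = False
    log.clear()
    return visited


def make_state(num_pages, dirtied):
    """Build a table of dirty flags plus a matching log of dirtied pages."""
    flags = {page: False for page in range(num_pages)}
    for page in dirtied:
        flags[page] = True
    return flags, list(dirtied)


flags_a, _ = make_state(1024, [7, 300, 901])
flags_b, log_b = make_state(1024, [7, 300, 901])

scan_visits = clean_by_scan(flags_a)          # touches all 1024 descriptors
log_visits = clean_from_log(flags_b, log_b)   # touches only the 3 logged ones
```

Both approaches leave every descriptor clean, but the log-driven one does work proportional to the number of dirtied pages rather than to the size of the guest's memory.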

Together these features can give significant performance and efficiency improvements to live migration.

Checked pointer arithmetic

AArch64 supports features that repurpose the upper bits of registers holding addresses, for example Tagged Pointers, introduced in Armv8.0-A, and MTE, introduced in Armv8.5-A.

Software frequently needs to manipulate pointers, for example adding an offset to a base address. This is typically done using regular arithmetic operations, such as add or subtract. An overflow in the address calculation could corrupt the non-address bits. For example, if MTE is being used, the address manipulation could change the Tag stored in the pointer. A corrupted tag might lead to the processor not detecting a memory safety violation.

The 2023 extensions introduce new instructions specifically intended for operating on pointers. These instructions incorporate multiple pointer-specific checks, including a check on whether bits[63:56] are modified, and protection against overflow. Load and store instructions with <base+offset> addressing modes can also be configured to preserve bits[63:56].

Taking the previous MTE example, the new features allow the processor to detect if the top 8 bits of the pointer have been modified. This means that if the MTE tag were corrupted it would be reported back to software.
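The MTE example can be modelled in a few lines of Python. This is a sketch of the idea only: it assumes the MTE tag sits in bits[59:56] of a 64-bit pointer, and checked_add is a hypothetical stand-in that raises an exception when bits[63:56] change. It does not reproduce the semantics of the new instructions or how the processor reports the condition.

```python
MASK64 = (1 << 64) - 1
ADDRESS_BITS = 56  # bits[55:0] hold the address in this model


def make_tagged(address: int, tag: int) -> int:
    """Place a 4-bit tag in bits[59:56] above a 56-bit address."""
    return ((tag & 0xF) << ADDRESS_BITS) | (address & ((1 << ADDRESS_BITS) - 1))


def tag_of(pointer: int) -> int:
    """Extract the 4-bit tag from bits[59:56]."""
    return (pointer >> ADDRESS_BITS) & 0xF


def plain_add(pointer: int, offset: int) -> int:
    """Ordinary 64-bit add: a carry out of bit 55 spills into the tag."""
    return (pointer + offset) & MASK64


def checked_add(pointer: int, offset: int) -> int:
    """Hypothetical checked add: reject results that alter bits[63:56]."""
    result = plain_add(pointer, offset)
    if (result >> ADDRESS_BITS) != (pointer >> ADDRESS_BITS):
        raise OverflowError("pointer arithmetic modified bits[63:56]")
    return result


# A pointer near the top of the 56-bit address range, carrying tag 0x3.
ptr = make_tagged(0xFFFFFFFFFFFFF0, 0x3)

# A small offset stays within the address bits, so the tag survives.
ok = checked_add(ptr, 0x8)

# A larger offset carries into bit 56 and silently changes the tag.
corrupted = plain_add(ptr, 0x20)

# The checked version turns the same calculation into a reported error.
try:
    checked_add(ptr, 0x20)
    detected = False
except OverflowError:
    detected = True
```

With a plain add the tag changes from 0x3 to 0x4 without any indication, whereas the checked version flags the calculation instead of returning a pointer with a different tag.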

Other functionality

Other enhancements introduced as part of the 2023 extensions include:

  • Support for using a combination of the PC (Program Counter) and the SP (the currently selected Stack Pointer) as the modifier when generating or checking Pointer Authentication codes.
  • For Realm Management Extension (RME) enabled designs, support for non-secure only in the Granule Protection Tables and the ability to disable certain Physical Address Spaces (PAS).
  • EL3 configuration write-traps.
  • Breakpoint support for address range and mismatch triggering without the need for linking.
  • Support for efficiently delegating SErrors from EL3 to EL2 or EL1.

Summary

This blog provides a brief introduction to the latest features included in the Arm architecture as Armv9.5-A. More detailed information can be found on our Developer website.

Over the coming months, Arm will be working with our partners to ensure that the software ecosystem is enabled to utilize these features as soon as future processors become available.

