SPONSOR BLOG

System Design Considerations For Embedded Heterogeneous Multiprocessing (HMP)

Integrating functionally asymmetric compute elements requires unique system design choices.

August 10th, 2017 - By: Kinjal Dave

Heterogeneous multiprocessor (HMP) systems, which integrate functionally asymmetric compute elements such as application processors and microcontrollers within the same SoC, are now used extensively across a wide range of applications. These SoCs are used in smart, connected devices to transform the way we live – at home, in the car, and in our cities – with even more intelligence, speed and efficiency.

A fundamental requirement for modern embedded systems is using the right compute element for a given task. It enables these systems to meet the conflicting requirements of delivering high performance while improving overall system efficiency. However, architecting heterogeneous systems requires unique system and software considerations.

This blog focuses on two questions:

Why is heterogeneous computing such a fundamental requirement of modern compute systems? and
What are the system design choices to be made when architecting such heterogeneous multiprocessing (HMP) systems to integrate functionally asymmetric compute elements (e.g. Cortex-A53 and Cortex-M4) in the same system?

The key challenge of modern compute systems
The most significant challenge for modern compute systems is the requirement to handle a diversity of workloads without compromising on system efficiency.

To meet these diverse compute requirements and improve the efficiency of these systems, SoC architects rely on integrating functionally asymmetric processors within the same SoC. However, integrating microcontrollers and application processors, which differ significantly in ISA, performance and software, requires some key considerations at the system level.

So, what are some fundamental considerations when architecting an energy-efficient, heterogeneous compute system?

System design considerations for architecting embedded HMP systems
There are several types of HMP systems. In a generic sense, an HMP system is a complex system that combines several different compute elements, such as a general-purpose processor, a graphics processor, an image processor, a video processor, a display processor and possibly several accelerators. Fig. 1 shows a typical HMP compute system that includes several compute elements.

This blog discusses the system design considerations for integrating Arm’s application processors (e.g. Cortex-A53 or Cortex-A35) with microcontrollers (e.g. Cortex-M4 or Cortex-M33) in the same SoC. Consider the generic compute subsystem shown in Figure 2, using the Cortex-A and Cortex-M processors.

Figure 1: A generic heterogeneous multiprocessor (HMP) compute system

The system designer needs to consider the following fundamental questions when designing heterogeneous compute systems using Cortex-A and Cortex-M processors:

How do you address the memory map differences?
How do you distribute interrupts across the application processor and the microcontroller subsystems?
How do you handle inter-processor communication?
How do you handle Secure/Non-secure state communication?

1. How do you address the memory map differences?
There are two approaches to reconciling the different memory maps: one optimizes for low area cost, the other for flexibility.

Low area cost:

Advantages: sharing a common address space, peripherals grouped together in a system
Disadvantages: restrictive; requires a design-time decision

More flexibility with a System Memory Management Unit (SMMU):

Provides a moveable window, allowing accesses to addresses beyond 32-bit for the Cortex-M processor subsystem
Adds a security attribute to transactions, allowing access to both Secure and Non-secure resources (if needed)
Run-time configurable by software

Figure 2: Using an SMMU allows more flexibility for a processor to access a wider memory addressing space
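
As a rough illustration of the moveable-window idea, the sketch below models the address translation in plain C. It is not the programming interface of any real SMMU: the `hmp_window_t` type, its fields and both function names are hypothetical, and the model assumes a single window that maps a contiguous range of the 32-bit Cortex-M map onto a wider system address and tags each access with one security attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one software-programmable window. */
typedef struct {
    uint32_t local_base;   /* start of the window in the 32-bit Cortex-M map */
    uint32_t size;         /* window size in bytes */
    uint64_t system_base;  /* where the window currently points in the system map */
    bool     non_secure;   /* security attribute attached to outgoing transactions */
} hmp_window_t;

/* Translate a 32-bit local address into a system address.
 * Returns false if the address falls outside the window. */
static bool hmp_window_translate(const hmp_window_t *w, uint32_t local_addr,
                                 uint64_t *system_addr, bool *non_secure)
{
    assert(w != NULL && system_addr != NULL && non_secure != NULL);
    if (local_addr < w->local_base || local_addr - w->local_base >= w->size) {
        return false;
    }
    *system_addr = w->system_base + (uint64_t)(local_addr - w->local_base);
    *non_secure = w->non_secure;
    return true;
}

/* Re-point the window at run time, standing in for a register write by software. */
static void hmp_window_move(hmp_window_t *w, uint64_t new_system_base)
{
    assert(w != NULL);
    w->system_base = new_system_base;
}
```

With a 256MB window at local address 0x60000000 pointing at system address 0x880000000, a Cortex-M access to 0x60001000 would land at 0x880001000. Calling `hmp_window_move` changes the target without touching the Cortex-M side of the map, which is the run-time flexibility the fixed shared address space lacks.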

2. How do you distribute interrupts?
It can be necessary to share interrupt sources between processors of different classes. For example, a sensor might be serviced by an always-on Cortex-M core while the Cortex-A processors are asleep. The GIC architecture is intended for use with Cortex-A and Cortex-R class processors; there is no support for connecting Cortex-M processors to GICv3/v4 interrupt controllers. Cortex-M processors have their own interrupt controller, the Nested Vectored Interrupt Controller (NVIC), which has a similar programmer’s model and functionality to the GIC. Shared interrupt sources may therefore need to be connected to both interrupt controllers. This is relatively simple for wired interrupts, whereas the NVIC would need wrapper logic to handle message-based interrupts.

Option 1: Wired interrupt

Advantage: Easy system design
Disadvantage: Higher software overhead when switching interrupt allocation
- NVIC configuration is accessible from the Cortex-M processor only
- GIC configuration might not be accessible from Cortex-M processor
- Use software mailbox to synchronize configuration changes (requires IPC)

Option 2: Message-based interrupt

Advantages: Small hardware cost; significant reduction in software overhead for interrupt allocation
Flexible design options, for example:
- Message from Cortex-A to Cortex-M
- Message from Cortex-M to Cortex-A
- Interrupt distribution unit (shared)

See Figure 3 below for the system design options.

Figure 3: Two system design options for interrupts, for example with Arm Cortex-A and Cortex-M processors
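
To make the message-based option more concrete, here is a small software model of it, written under several assumptions. The `hmp_irq_wrapper_t` type, the 64-interrupt limit and both functions are hypothetical and do not correspond to GIC or NVIC registers. The model only captures the idea that raising an interrupt becomes a write of an interrupt ID, and that moving an interrupt from one subsystem to the other becomes a small configuration change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HMP_NUM_IRQS 64u  /* hypothetical number of shared interrupt sources */

/* Software model of hypothetical wrapper logic in front of one interrupt controller. */
typedef struct {
    uint64_t pending;  /* bit n set: interrupt n is pending for this subsystem */
    uint64_t enabled;  /* bit n set: this subsystem currently owns interrupt n */
} hmp_irq_wrapper_t;

/* A message-based interrupt is modeled as a write of the interrupt ID.
 * Returns 0 on success, -1 if the ID is out of range or not owned here. */
static int hmp_irq_message_write(hmp_irq_wrapper_t *w, uint32_t irq_id)
{
    assert(w != NULL);
    if (irq_id >= HMP_NUM_IRQS || !(w->enabled & (UINT64_C(1) << irq_id))) {
        return -1;
    }
    w->pending |= UINT64_C(1) << irq_id;
    return 0;
}

/* Move ownership of one interrupt source from one subsystem to the other. */
static void hmp_irq_reassign(hmp_irq_wrapper_t *from, hmp_irq_wrapper_t *to,
                             uint32_t irq_id)
{
    assert(from != NULL && to != NULL && irq_id < HMP_NUM_IRQS);
    from->enabled &= ~(UINT64_C(1) << irq_id);
    to->enabled |= UINT64_C(1) << irq_id;
}
```

In the sensor example, interrupt 5 could be owned by the Cortex-M wrapper while the Cortex-A processors are asleep and reassigned to the Cortex-A side on wake-up. With wired interrupts, the same change needs each processor to reconfigure its own controller and a mailbox exchange to keep the two in step.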

3. How do you handle inter-processor communication?
Software running on two different processors needs to be able to communicate with each other. There are two elements to this:

- Sending an interrupt to the other processor(s)
- Shared memory for mailbox and semaphore data

Such communication would typically be via mailboxes in shared memory. This needs to be memory that is part of the main system’s address space, so that, for example, both the Cortex-A processors and the Cortex-M subsystem have visibility of it.

Such mailboxes might be complemented by doorbell interrupts, which signal the presence of new messages or the completion of previous commands. This requires a mechanism for each processor to generate interrupts in the other’s interrupt controller. See an example system diagram in Figure 4 below.

Here are a few use cases of when this communication is required, using Cortex processors as an example:

Cortex-A system requesting system control activities from Cortex-M system controller
Cortex-M sensor hub reporting data to Cortex-A processors
Initiating handover of a shared peripheral from one system to another

Figure 4: Handling inter-processor communication between Cortex-A, Cortex-R and Cortex-M processor subsystems
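
The mailbox-plus-doorbell pattern described above can be sketched as a single-producer, single-consumer ring in C11. Every name here (`mbox_t`, `mbox_send`, `mbox_receive`, `MBOX_SLOTS`, the doorbell callback) is hypothetical. The sketch also assumes that the structure lives in memory both processors see coherently, or that is mapped non-cacheable, and that C11 atomics map onto suitable barriers on both cores. On real hardware the doorbell would be a register write that raises an interrupt in the other processor's interrupt controller; here it is just a function pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MBOX_SLOTS 8u  /* hypothetical depth; must be a power of two */

typedef struct {
    _Atomic uint32_t head;       /* advanced by the sender only */
    _Atomic uint32_t tail;       /* advanced by the receiver only */
    uint32_t slots[MBOX_SLOTS];  /* message payloads */
} mbox_t;

/* Hypothetical doorbell hook, standing in for an interrupt-generating write. */
typedef void (*mbox_doorbell_fn)(void);

/* Stand-in doorbell for host-side experiments: it only counts rings. */
static unsigned mbox_doorbell_rings;
static void mbox_ring_counter(void) { mbox_doorbell_rings++; }

/* Post one message and ring the doorbell; returns false if the mailbox is full. */
static bool mbox_send(mbox_t *mb, uint32_t msg, mbox_doorbell_fn ring)
{
    assert(mb != NULL);
    uint32_t head = atomic_load_explicit(&mb->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&mb->tail, memory_order_acquire);
    if (head - tail == MBOX_SLOTS) {
        return false;
    }
    mb->slots[head % MBOX_SLOTS] = msg;
    /* Release: the payload must be visible before the new head is. */
    atomic_store_explicit(&mb->head, head + 1u, memory_order_release);
    if (ring != NULL) {
        ring();
    }
    return true;
}

/* Fetch the oldest message; returns false if the mailbox is empty. */
static bool mbox_receive(mbox_t *mb, uint32_t *msg)
{
    assert(mb != NULL && msg != NULL);
    uint32_t tail = atomic_load_explicit(&mb->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&mb->head, memory_order_acquire);
    if (head == tail) {
        return false;
    }
    *msg = mb->slots[tail % MBOX_SLOTS];
    atomic_store_explicit(&mb->tail, tail + 1u, memory_order_release);
    return true;
}
```

A sensor hub on the Cortex-M side might call `mbox_send` with each new reading, while the Cortex-A side drains the ring with `mbox_receive` from its doorbell interrupt handler. A second mailbox in the opposite direction would carry requests from the Cortex-A system to the Cortex-M system controller.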

4. How do you handle Secure/Non-secure state communication?
Architecting security into modern compute systems is necessary to enable devices to counter the specific threats that they might experience. Typical use cases include the protection of authentication mechanisms, cryptography, key material and digital rights management (DRM).

Key considerations when implementing security in an HMP system (see Figure 5 for an example system diagram):

If you are combining processors that do not use the TrustZone security extension, the compute subsystem using them must be defined as always Secure (e.g. system control processor subsystem) or always Non-secure (e.g. audio subsystem)
Ensure that the debug system matches the security domains for each processor
System memory partitioning and interrupt distribution in the Secure/Non-secure worlds across the two processor subsystems
Secure and Non-secure memory partitioning must match between the different processor subsystems

Figure 5: An example of an HMP system with hardware-enforced security, using TrustZone security extension and Cortex processors
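
The requirement that Secure and Non-secure memory partitioning must match between subsystems lends itself to a simple consistency check. The sketch below is one possible way to express it in C. The `hmp_region_t` type and both function names are hypothetical, each subsystem's view is assumed to be a plain list of regions expressed in system addresses, and region bounds are assumed not to wrap around the 64-bit address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one memory region as seen by one subsystem. */
typedef struct {
    uint64_t base;    /* first byte of the region in the system map */
    uint64_t size;    /* region size in bytes (nonzero) */
    bool     secure;  /* true = Secure, false = Non-secure */
} hmp_region_t;

/* True if the two regions share at least one byte. */
static bool hmp_regions_overlap(const hmp_region_t *a, const hmp_region_t *b)
{
    return a->base < b->base + b->size && b->base < a->base + a->size;
}

/* True when no overlapping pair of regions disagrees on its security state. */
static bool hmp_partitions_match(const hmp_region_t *a, size_t na,
                                 const hmp_region_t *b, size_t nb)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < na; i++) {
        for (size_t j = 0; j < nb; j++) {
            if (hmp_regions_overlap(&a[i], &b[j]) && a[i].secure != b[j].secure) {
                return false;
            }
        }
    }
    return true;
}
```

A check like this could run at design time against the two memory maps, or in early boot code before the second subsystem is released from reset. Either way, a mismatch is caught before one processor can reach as Non-secure what the other treats as Secure.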

There are a number of other design considerations to bear in mind when designing heterogeneous multiprocessing (HMP) systems. This blog only scratches the surface of the system design choices. Download the full whitepaper to see more hardware system diagrams and the software considerations.

Kinjal Dave

Kinjal Dave is a product manager with ARM's CPU Group in Cambridge, UK.
