Constructing The Pillars Of The ARM HPC Ecosystem

Alternative HPC architectures will only happen if a strong supporting software ecosystem is in place.

In talking with HPC users at SC15 following the announcement of the OpenHPC project, I consistently heard that while they valued having a common open source framework covering a baseline set of HPC codes, they wanted to see more than one chip architecture represented. This matters because many HPC users are focused on reaching exascale computing in future supercomputer deployments. They need the flexibility to shift to an alternative architecture for their next deployment if it offers better energy efficiency, density, performance, and scalability for their compute needs. However, this will only happen if a strong supporting HPC software ecosystem for that architecture is in place.

Efforts were already underway at ARM to build up our HPC software ecosystem, and we immediately saw that OpenHPC aligned well with those efforts. In June, we were officially announced as a founding member of OpenHPC, and less than six months later I’m pleased to announce that ARMv8-A will be the first alternative architecture with OpenHPC support. The initial baseline release of OpenHPC for ARMv8-A will be available as part of the forthcoming OpenHPC v1.2 release at SC16. This is yet another milestone that levels the playing field for the ARM server ecosystem and will accelerate choice within the HPC community.

Collaborating for choice in HPC
The ARM ecosystem is unlike any other in the industry today. ARM partners are shipping a diverse range of SoC-based solutions (15 billion shipped in 2015) that can leverage a common software ecosystem. Our server ecosystem has long leveraged the advantages of a development model based on an open source framework and we’ve followed the same strategy here in support of the HPC community.

Working with a couple of key partners, namely Cavium and SUSE, we enabled the OpenHPC build environment with the latest ARM-based hardware and operating system support. Cavium provided their latest dual-socket servers based on ThunderX for installation at the TACC site in Austin, TX, and SUSE has just announced full ARMv8-A support in SUSE Linux Enterprise Server 12. Taken together, you now have a complete open and standards-based HPC platform covering hardware, OS, and a full set of community-defined HPC software tools, all pre-built and tested on the ARMv8-A architecture.

As a founder and board member of OpenHPC, SUSE has contributed the Linux OS elements along with the underlying HPC system building components and system tools. Cavium, with its ThunderX processors based on the ARMv8-A architecture, already has several HPC end-user and system engagements underway. Some public examples are Cavium’s deployment at the Hartree Centre in the UK and their engagement with the Barcelona Supercomputing Center in Spain. Currently, OpenHPC provides releases for the SUSE and CentOS operating systems, and our initial ARM release in OpenHPC v1.2 will cover both as well.

Fine-tuning for peak HPC performance
Complementing our open source software strategy, ARM continues to make great progress in its commercial HPC tool set. Reaching efficient exascale computing requires developer tools that support fine-tuning for peak HPC performance on ARM-based partner platforms. To help achieve this, ARM is announcing a pair of commercial Linux user-space compilers that run natively on ARMv8-A hardware and generate code for the current and future generations of the ARMv8-A architecture. Both compilers will initially support C/C++, with Fortran support on the way in 2017. We’ve split the compiler products into two choices, based on user needs:

  • ARM Compiler for HPC: Optimized for Neon vectorization on ARMv8-A, this compiling environment combines the ARM Performance Libraries with a commercially supported ARM-native compiler.
  • ARM SVE Compiler for HPC: Supports the recently announced Scalable Vector Extension (SVE). It includes compiler auto-vectorization passes, SVE-tuned kernels in the ARM Performance Libraries, and the ARM Instruction Emulator, which allows SVE application binaries to execute on the non-SVE ARMv8-A systems available today.
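
To make the auto-vectorization point concrete, here is a minimal sketch in plain C of the kind of loop a vectorizing compiler can typically map onto Neon or SVE instructions. The function name daxpy and its signature are illustrative choices, not part of any ARM product, and whether a given compiler actually vectorizes this loop depends on its version and optimization settings.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative kernel (hypothetical name): y[i] = a * x[i] + y[i].
   'restrict' promises that x and y do not overlap, which removes the
   aliasing concern that often prevents a compiler from vectorizing. */
void daxpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; ++i) {
        /* Each iteration is independent of the others, so several
           elements can be processed by a single vector instruction. */
        y[i] = a * x[i] + y[i];
    }
}
```

The source contains no intrinsics or architecture-specific code, so the same function can be rebuilt for different vector units without modification.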

The SVE extension to the ARMv8-A architecture significantly extends the vector processing capabilities associated with AArch64 execution in the ARM architecture, enabling implementation choices for vector lengths that scale from 128 to 2048 bits. While server SoC designs based on ARMv8-A SVE are still a few years down the road, it’s important that we begin work with developers now to ensure a robust HPC software ecosystem is in place to support those designs.
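
As a rough illustration of what a scalable vector length means for software, the sketch below emulates a vector-length-agnostic reduction in plain C. It is a scalar model written for this article, not SVE intrinsics or compiler output; the names lanes_f64 and vla_sum are hypothetical, and the assumption that lengths come in multiples of 128 bits is stated in the code.

```c
#include <assert.h>
#include <stddef.h>

/* Number of 64-bit lanes in a vector of vl_bits bits.
   Assumption: valid lengths are multiples of 128 bits, 128 to 2048. */
size_t lanes_f64(unsigned vl_bits)
{
    assert(vl_bits >= 128 && vl_bits <= 2048 && vl_bits % 128 == 0);
    return vl_bits / 64;
}

/* Scalar emulation of a vector-length-agnostic sum. The outer loop
   advances by one "vector" per step; the inner loop stands in for the
   lanes of a single vector instruction. */
double vla_sum(const double *x, size_t n, unsigned vl_bits)
{
    size_t lanes = lanes_f64(vl_bits);
    double acc[32] = {0.0};               /* 2048 / 64 = 32 lanes at most */

    for (size_t base = 0; base < n; base += lanes) {
        for (size_t lane = 0; lane < lanes; ++lane) {
            /* Predicate: lanes past the end of the array stay inactive,
               so no separate scalar tail loop is needed. */
            if (base + lane < n) {
                acc[lane] += x[base + lane];
            }
        }
    }

    /* Horizontal reduction across the per-lane partial sums. */
    double total = 0.0;
    for (size_t lane = 0; lane < lanes; ++lane) {
        total += acc[lane];
    }
    return total;
}
```

Calling vla_sum with 128, 512, or 2048 as the vector length runs the same code path with 2, 8, or 32 lanes per step, which is the property that lets one binary serve implementations with different vector widths.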