Chinese company leverages homegrown technology in supercomputing race.
New versions of the Top500 and Green500 lists have been released, and Frontier continues its reign at Number 1. But a newcomer, Aurora, using Intel’s Sapphire Rapids, has entered at the Number 2 position with a “half-scale” system.
Both machines are HPE Crays, with the former using AMD optimized third-gen EPYC 64C at 2.0GHz and AMD Instinct MI250X, while the latter uses Intel Xeon CPU Max 9470 52C at 2.4GHz and Intel Data Center GPU Max. Frontier clocks in with an Energy Efficiency score of 52.592 GFLOPS/W (Number 8 on the Green500 list) while Aurora currently scores 23.711 GFLOPS/W (Number 38 on the Green500 list).
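To put those two efficiency scores in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes a hypothetical workload of one sustained exaflop per second and asks how much power each machine would draw at its quoted Green500 efficiency. The 1 EFLOP/s target and the helper name megawatts_needed are illustrative assumptions, not figures or methods from either list.

```python
# Green500 efficiency figures quoted in the article, in GFLOPS per watt.
EFFICIENCY_GFLOPS_PER_W = {
    "Frontier": 52.592,
    "Aurora": 23.711,
}

# Hypothetical target for illustration only: one sustained exaflop per second.
TARGET_GFLOPS = 1e9  # 1 EFLOP/s = 10^9 GFLOP/s


def megawatts_needed(target_gflops: float, gflops_per_watt: float) -> float:
    """Power in MW needed to sustain target_gflops at the given efficiency."""
    watts = target_gflops / gflops_per_watt
    return watts / 1e6


power_mw = {
    name: megawatts_needed(TARGET_GFLOPS, eff)
    for name, eff in EFFICIENCY_GFLOPS_PER_W.items()
}
efficiency_ratio = (
    EFFICIENCY_GFLOPS_PER_W["Frontier"] / EFFICIENCY_GFLOPS_PER_W["Aurora"]
)

for name, mw in power_mw.items():
    print(f"{name}: {mw:.1f} MW per sustained EFLOP/s")
print(f"Frontier / Aurora efficiency ratio: {efficiency_ratio:.2f}")
```

This is only a scaling exercise; it says nothing about the power either system actually draws, and Aurora’s score comes from a partial system.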
I wrote about the Green500 seven years ago in an article titled Green Computing: GPUs Strike Back. At that time, the Sunway TaihuLight, built by China’s NRCPC, was sitting at the top of the Top500 list. So what has happened to the lead that China had in supercomputing? Chinese researchers are apparently still very active in developing advanced machines, and news has leaked out at times about Exascale machines existing in China, too.
While there haven’t been formal submissions to the Top500 list, at last month’s SC23 supercomputing conference there were two submissions for Gordon Bell Prize awards that included work using new Sunway supercomputers. The first submission, “Towards Exascale Computation for Turbomachinery Flows,” was Finalist 2 in the competition. The work was run on Wuxi’s new Sunway supercomputer, where each computation node consists of 384 calculation cores and the total system contains up to 19.2 million cores.
The second submission was made to the new-for-2023 category of the ACM Gordon Bell Prize for Climate Modelling. Again, a Sunway entry, titled “Establishing a Modeling System in 3-km Horizontal Resolution for Global Atmospheric Circulation Triggered by Submarine Volcanic Eruptions with 400 Billion Smoothed Particle Hydrodynamics,” finished as Finalist 2 in the competition. Here it was reported that the modeling system was able to use 400 billion particles with 80% parallel efficiency on 39 million processor cores, or approximately twice as many cores as the previously mentioned work.
In the paper “5 ExaFlop/s HPL-MxP Benchmark with Linear Scalability on the 40-Million-Core Sunway Supercomputer,” Rongfen Lin et al. describe a system consisting of more than 107,520 SW26010-Pro CPUs, with a parallel scale of 41,932,800 cores. The 5.048 ExaFlop/s score on the HPL-MxP benchmark is good enough for second place behind Frontier’s 9.95 ExaFlop/s score. The Sunway’s FP16 efficiency is 85%, which is better than Frontier’s 74% and the highest among all heterogeneous manycore systems on the HPL-MxP list, but not as high as Fugaku’s 93%. Still, this is a very notable achievement for research work using a “homegrown” processor and architecture rather than relying on the big-name manufacturers used in many of the other systems in the Top500. Hopefully we’ll see the Sunway machine’s scores submitted to future Top500 lists so that the researchers working on these systems get the recognition they deserve for their work in supercomputing.
Fig. 1: Hardware architecture of the SW26010-Pro processor. Source: “5 ExaFlop/s HPL-MxP Benchmark with Linear Scalability on the 40-Million-Core Sunway Supercomputer.”
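The figures quoted from the paper can be cross-checked with some simple arithmetic. The Python sketch below assumes the CPU count is exactly 107,520 (the text says “more than”) and uses only numbers given in this article; the variable names are illustrative. It derives cores per CPU, average HPL-MxP throughput per CPU, and the Sunway score as a fraction of Frontier’s.

```python
# Figures as quoted in the article; the CPU count is assumed to be exact.
CPUS = 107_520
TOTAL_CORES = 41_932_800
SUNWAY_HPL_MXP_EFLOPS = 5.048
FRONTIER_HPL_MXP_EFLOPS = 9.95

# Cores per CPU, with the remainder kept to check the counts divide evenly.
cores_per_cpu, leftover_cores = divmod(TOTAL_CORES, CPUS)

# Average mixed-precision throughput per CPU (1 EFLOP/s = 1e6 TFLOP/s).
tflops_per_cpu = SUNWAY_HPL_MXP_EFLOPS * 1e6 / CPUS

# Sunway's HPL-MxP score relative to Frontier's.
fraction_of_frontier = SUNWAY_HPL_MXP_EFLOPS / FRONTIER_HPL_MXP_EFLOPS

print(f"Cores per CPU: {cores_per_cpu} (remainder {leftover_cores})")
print(f"Average HPL-MxP throughput per CPU: {tflops_per_cpu:.2f} TFLOP/s")
print(f"Sunway score as a fraction of Frontier's: {fraction_of_frontier:.1%}")
```

Under that assumption the counts divide evenly into 390 cores per CPU. That is six more than the 384 calculation cores per node cited for the turbomachinery work, which would fit a split between calculation and management cores, although the text here does not spell that out.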