Mont-Blanc: European scalable and power efficient HPC platform based on low-power embedded technology

Description

There is a continued need for higher compute performance in scientific grand challenges, engineering, geophysics, bioinformatics, and other fields. However, energy is increasingly becoming one of the most expensive resources and the dominant cost item for running a large supercomputing facility. In fact, the total energy cost of a few years of operation can almost equal the cost of the hardware infrastructure. Energy efficiency is already a primary concern for the design of any computer system, and it is unanimously recognized that Exascale systems will be strongly constrained by power.

The analysis of the performance of HPC systems since 1993 shows exponential improvements at the rate of one order of magnitude every 3 years: one petaflops was achieved in 2008, and one exaflops is expected in 2020. Based on a 20 MW power budget, this requires an efficiency of 50 GFLOPS/Watt. However, the current leader in energy efficiency achieves only 1.7 GFLOPS/Watt. Thus, a 30x improvement is required.
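The efficiency figures above follow from simple arithmetic, which the short sketch below reproduces. It uses only the numbers quoted in the text (one exaflops, a 20 MW budget, 1.7 GFLOPS/Watt for the current leader); the variable and function names are illustrative and not part of the project.

```python
# Back-of-the-envelope check of the efficiency figures quoted in the text.
# All names here are illustrative; the inputs are the values stated above.

EXAFLOPS = 1e18               # target performance, in FLOP/s
POWER_BUDGET_WATTS = 20e6     # 20 MW power budget, in watts
CURRENT_BEST_GFLOPS_W = 1.7   # current efficiency leader, in GFLOPS/Watt


def required_gflops_per_watt(flops: float, watts: float) -> float:
    """Return the efficiency (GFLOPS/Watt) needed to reach `flops` within `watts`."""
    return flops / watts / 1e9


required = required_gflops_per_watt(EXAFLOPS, POWER_BUDGET_WATTS)
improvement = required / CURRENT_BEST_GFLOPS_W

print(f"Required efficiency: {required:.0f} GFLOPS/Watt")
print(f"Improvement over current leader: {improvement:.1f}x")
```

The ratio comes out slightly below 30, which the text rounds to a 30x improvement.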

In this project, we believe that HPC systems developed from today's energy-efficient solutions used in embedded and mobile devices are the most likely to succeed. As of today, the CPUs of these devices are mostly designed by ARM. However, ARM processors have not been designed for HPC, and ARM chips have never been used in HPC systems before, leading to a number of significant challenges.

The Mont-Blanc project has three objectives:

  • To develop a fully functional energy-efficient HPC prototype using low-power commercially available embedded technology.
  • To design a next-generation HPC system together with a range of embedded technologies in order to overcome the limitations identified in the prototype system.
  • To develop a portfolio of exascale applications to be run on this new generation of HPC systems.

Together, these objectives will produce a new type of computer architecture, capable of setting future global HPC standards, that will provide Exascale performance using 15 to 30 times less energy.

Funding