BSC research accelerates HPC workloads with less power-hungry DRAM

21 July 2020
Intelligent allocation of data to Intel Optane persistent memory stores more data close to the CPU with fewer power-hungry DIMMs.

BSC, together with Intel, develops software to efficiently automate the data distribution process in these systems.

The work is carried out under the Intel-BSC Exascale Lab, in collaboration with the EPEEC project.

Barcelona Supercomputing Center (BSC), under the Intel-BSC Exascale Lab and in collaboration with the BSC-led EPEEC project, is leading the development of new software tools and expanding the software ecosystem for 2nd Generation Intel Xeon Scalable processors and Intel Optane persistent memory (Intel Optane PMem). This work is helping accelerate High Performance Computing (HPC) applications using heterogeneous memory architectures.

BSC researcher Antonio Peña leads this research, which explores how to accelerate large HPC workloads by leveraging heterogeneous memory systems. With Intel Optane PMem and 2nd Generation Intel Xeon Scalable processors, he is driving breakthrough architectures that enable high-performance workloads with large datasets on HPC clusters while drawing less power than an equivalent capacity of DRAM.

“Right now, many HPC applications are constrained by the amount of DRAM in the nodes and cluster,” Peña explained. “They need more and more memory but adding larger and more DIMMs with the current technology is not feasible due to the power constraints on the overall system.”

“We’re trying to reduce server power while accelerating applications by using Intel Optane PMem and intelligently managing where the data is located and its movements,” Peña said. “We can take advantage of the large memory sizes that the new technology offers and put more data close to the processor using considerably less energy. There is a slightly longer latency than DRAM, but we don’t have to pay for the penalty of even more latency going to other storage technologies.”

Innovative Data Profiling and Memory Allocation Tools for Intelligent Data Management

To enable this approach with heterogeneous memories, Peña and his team have created several software tools using Extrae, a general-purpose profiler developed by BSC; the Intel VTune™ Profiler; and Extended Valgrind for Object Differentiated Profiling (EVOP), among others. EVOP was first developed by Peña at ANL and is now maintained at BSC. The tools first perform what Peña calls data-oriented profiling: the profilers run while the application executes normally, analyze the demand and latencies for the different data objects, and create a large file listing all data accesses.
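As a rough illustration of how a file of data accesses can be reduced to per-object metrics, the following Python sketch aggregates a small trace. The CSV layout, the column names (`object`, `latency_ns`, `llc_miss`), and the `aggregate_trace` helper are all hypothetical and chosen only for this example; they are not the output format of Extrae, VTune, or EVOP.

```python
import csv
import io
from collections import defaultdict


def aggregate_trace(trace_file):
    """Sum accesses, last-level-cache misses, and latency per data object.

    Assumes a hypothetical CSV trace with one row per memory access and
    the columns: object, latency_ns, llc_miss (0 or 1).
    """
    stats = defaultdict(lambda: {"accesses": 0, "llc_misses": 0, "latency_ns": 0})
    for row in csv.DictReader(trace_file):
        entry = stats[row["object"]]
        entry["accesses"] += 1
        entry["llc_misses"] += int(row["llc_miss"])
        entry["latency_ns"] += int(row["latency_ns"])
    return dict(stats)


# Tiny made-up trace standing in for the large access file.
SAMPLE_TRACE = """object,latency_ns,llc_miss
matrix_a,90,1
matrix_a,4,0
index,300,1
matrix_a,85,1
"""

profile = aggregate_trace(io.StringIO(SAMPLE_TRACE))
for name, metrics in profile.items():
    print(name, metrics)
```

A real trace would be far larger and would likely be streamed rather than held in memory, but the per-object aggregation step would look broadly similar.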

“Knowing how each data object is accessed during execution helps us decide in the optimization step where those have to be allocated in the different memories,” Peña described. “In a simplified view, we associate metrics with the different data objects. Then we count the number of accesses or the number of last level cache misses for each object. From this, we can apply different algorithms for memory allocations to maximize the performance.”
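One simple way to picture the optimization step is a greedy placement that ranks objects by last-level-cache misses per byte and fills DRAM first. The sketch below is an assumption made for illustration, not the algorithm used by the BSC tools; the `DataObject` type, the `place_objects` function, and the sample sizes and miss counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataObject:
    name: str
    size_bytes: int   # assumed to be greater than zero
    llc_misses: int   # metric gathered during profiling


def place_objects(objects, dram_capacity_bytes):
    """Greedy placement: objects with the most misses per byte go to DRAM
    while they fit; everything else is assigned to persistent memory."""
    placement = {}
    free = dram_capacity_bytes
    ranked = sorted(objects, key=lambda o: o.llc_misses / o.size_bytes, reverse=True)
    for obj in ranked:
        if obj.size_bytes <= free:
            placement[obj.name] = "dram"
            free -= obj.size_bytes
        else:
            placement[obj.name] = "pmem"
    return placement


GIB = 2**30
objects = [
    DataObject("lookup_table", 2 * GIB, 9_000_000),
    DataObject("mesh", 6 * GIB, 12_000_000),
    DataObject("checkpoint_buffer", 8 * GIB, 50_000),
]
placement = place_objects(objects, dram_capacity_bytes=8 * GIB)
print(placement)
```

With these made-up numbers, the two frequently missed objects fill the 8 GiB of DRAM and the rarely accessed buffer lands in persistent memory. A production allocator could weigh other metrics, such as access counts or read/write ratios, in the same ranking step.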

Caption: Members of the BSC research team handling a heterogeneous memory platform from the EPEEC project.

 
