The survival machine and surviving the machines by Jesús Labarta

25 June 2015

Have you ever listened to a neuroscientist talking about the human brain? It is absolutely fascinating. I will always remember a talk in which the brain was referred to as a “survival machine,” or SM. The idea is that the SM reacts appropriately to changing environments and hence survives in them. But how is it supposed to know how to react? It makes use of predictive models that are based on data assimilation.

Survival of the most adaptable

Thinking about this capability of adapting to changing environments, I couldn’t help but think about HPC – or, more specifically, about the interaction between programming models and changing hardware environments: in HPC we tend to be extremely confident about our mental models of how a certain computer and program will behave. But this confidence often does not survive a reality check. In fact, there is usually a huge divergence between our mental model and the actual application behavior on a given system. Hitting the “power wall” not only made us go multicore but also broke what used to be a clean interface (the ISA) between architecture and software. On our way to the Exascale era, systems will become even more heterogeneous and extremely variable. The only way to cope with such a “hostile” environment is to develop programming models and runtimes capable of adapting dynamically to the underlying hardware environment.

Breaking with outdated approaches to programming environments

In this context, we advocate decoupling programs from the machine again. Programs should be documents written to convey algorithms and ideas between humans, rather than ways of telling machines what to do or how to do it. I firmly believe that we need to rely on runtime systems and see them as survival machines: they must be capable of dynamically mapping the fairly abstract computational and data access requirements expressed in the program to the resources available on the platform at hand.

A programming model should provide the programmer with an interface to express their computational and data access requirements in an abstract way. A mechanism may also be provided to convey additional information so that the runtime can exploit the system efficiently. But such information should always take the form of abstract, high-level hints and must never be required for a functional description of the ideas. The challenge for a programming model is to provide these capabilities in a way that current programmers can perceive as incremental, even if they imply a long-term cultural shift in the way programs are written.

The way forward: the OmpSs programming model

We consider that our work on the OmpSs programming model and the NANOS runtime that supports it provides practical evidence that these ideas are viable and can be deployed incrementally. The fundamental concepts in OmpSs are tasks, identified by pragma annotations placed in front of regular sequential code, and directionality clauses that describe how the tasks access their data. From this information, the runtime is able to detect dependences and potential concurrency between tasks, as well as to perform the data movements that might be required on the specific underlying hardware.
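To give a flavour of what this looks like, the following minimal sketch annotates a blocked kernel with OmpSs-style in/inout directionality clauses. The kernel, block sizes and function names are invented purely for illustration; the point is only that the dependence graph is derived from the clauses rather than spelled out by the programmer.

#include <stdio.h>

/* Hypothetical blocked kernel, annotated OmpSs-style: the runtime builds the
   task dependence graph from the in/inout clauses and discovers the
   concurrency between independent blocks on its own. */

static void scale_block(double *block, int n, double factor)
{
    for (int i = 0; i < n; i++)
        block[i] *= factor;
}

static void add_block(const double *block, int n, double *sum)
{
    for (int i = 0; i < n; i++)
        *sum += block[i];
}

int main(void)
{
    enum { NBLOCKS = 8, BS = 1024 };
    static double v[NBLOCKS * BS];
    double sum = 0.0;

    for (int i = 0; i < NBLOCKS * BS; i++)
        v[i] = 1.0;

    for (int b = 0; b < NBLOCKS; b++) {
        /* Each scaling task touches only its own block, so the blocks can
           proceed in parallel. */
        #pragma omp task inout(v[b*BS;BS])
        scale_block(&v[b*BS], BS, 2.0);

        /* The reduction task depends on the scaling of the same block
           (through v) and on the previous reduction (through sum). */
        #pragma omp task in(v[b*BS;BS]) inout(sum)
        add_block(&v[b*BS], BS, &sum);
    }

    #pragma omp taskwait   /* wait for all tasks before using the result */
    printf("sum = %f\n", sum);
    return 0;
}

With a compiler that simply ignores the pragmas, the same code runs sequentially, which is what makes the adoption path incremental for current programmers.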

OmpSs within the DEEP project

Within the DEEP project we have extended the original OmpSs environment with MPI offloading capabilities. With it, one or several processes of an MPI application can collectively spawn further MPI computations, which can be dynamically mapped to different parts of a heterogeneous cluster based, for example, on the suitability of the cores or the interconnect.
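To give an idea of what the runtime takes off the programmer’s shoulders, the sketch below shows the kind of collective dynamic spawning this builds on, written directly against the MPI interface; the executable name and the message exchange are made up for illustration. In OmpSs the same effect is obtained through task annotations, with the associated data movement derived from the directionality clauses.

#include <mpi.h>
#include <stdio.h>

/* Plain-MPI illustration of collective dynamic spawning: the parent ranks
   jointly create a child MPI computation (the hypothetical executable
   "./booster_kernel") and talk to it over an intercommunicator. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Comm children;
    int errcodes[4];

    /* All parent ranks participate in the spawn; four child processes are
       started wherever the resource manager places them. */
    MPI_Comm_spawn("./booster_kernel", MPI_ARGV_NULL, 4,
                   MPI_INFO_NULL, 0, MPI_COMM_WORLD, &children, errcodes);

    /* Hand some input to the offloaded computation and collect its result. */
    double input = 3.14, result = 0.0;
    if (rank == 0) {
        MPI_Send(&input, 1, MPI_DOUBLE, 0, 0, children);
        MPI_Recv(&result, 1, MPI_DOUBLE, 0, 0, children, MPI_STATUS_IGNORE);
        printf("offloaded result: %f\n", result);
    }

    MPI_Comm_disconnect(&children);
    MPI_Finalize();
    return 0;
}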

The approach has proven extremely useful not only for offloading computations, as initially required by the DEEP project objectives, but also for multilevel nested spawns and for reverse offloading back to the original nodes, all with minimal and concise annotations. Offloading I/O tasks has turned out to be particularly interesting on systems where some of the compute nodes lack that capability. In many cases, this I/O offloading resulted in a transparent overlap of computation and I/O without any explicit programming effort. It is interesting to see how many unexpected, performance-beneficial execution orders and overlaps show up in OmpSs codes. Previous explicit parallelization efforts never dared to explore these opportunities, either because they would have implied complex code reworking or simply because the programmer never thought of them.
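The following sketch (hypothetical routine names, double-buffered purely for illustration) hints at where such overlap comes from: the checkpoint task only reads the buffer produced in the previous step, so the runtime is free to run it, or offload it, concurrently with the next compute step.

#include <stdio.h>

/* Hypothetical double-buffered time loop: the checkpoint task only reads the
   buffer of the previous step, so the runtime can run it concurrently with,
   or offload it away from, the compute task of the next step. */

static void compute_step(double *out, const double *in, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = in[i] + 1.0;           /* stand-in for the real computation */
}

static void write_checkpoint(const double *buf, int n, int step)
{
    /* stand-in for the real (possibly offloaded) I/O */
    printf("step %d: first of %d values is %f\n", step, n, buf[0]);
}

void time_loop(double *a, double *b, int n, int nsteps)
{
    double *cur = a, *next = b;

    for (int step = 0; step < nsteps; step++) {
        #pragma omp task in(cur[0;n]) out(next[0;n])
        compute_step(next, cur, n);

        /* Only reads cur, so it overlaps with the compute task above; the
           next iteration's compute task will not overwrite this buffer
           until the checkpoint has read it. */
        #pragma omp task in(cur[0;n])
        write_checkpoint(cur, n, step);

        double *tmp = cur; cur = next; next = tmp;  /* swap buffers */
    }
    #pragma omp taskwait
}

int main(void)
{
    enum { N = 4 };
    double a[N] = {0}, b[N] = {0};
    time_loop(a, b, N, 3);
    return 0;
}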

I understand that people might be hesitant to hand over the control they think they have to an autonomous runtime. But the evidence we keep gathering shows huge potential. Our vision is that a clean program has to focus only on specifying computational and data access requirements. This specification is then mapped to the hardware by a dynamic, adaptive and otherwise intelligent runtime, as this is the only way to survive the diversity, heterogeneity and variability of the machines of the future.

Come meet us at the joint European Exascale Projects booth #634. We are looking forward to lively discussions with you on this and further topics.

Are you interested in learning more about Exascale research in Europe? Then register for our workshop “Is Europe Ready For Exascale? A Summary of Four Years of European Exascale Research”, held on Thursday, July 16, 2015. For more information, please visit http://bit.ly/1Clnlg4

About the blogger

Jesus Labarta has been a full professor of Computer Architecture at the Technical University of Catalonia (UPC) since 1990. Since 1981 he has been lecturing on computer architecture, operating systems, computer networks and performance evaluation. His research interests centre on parallel computing, covering areas from multiprocessor architecture, memory hierarchies, programming models, parallelizing compilers, operating systems and the parallelization of numerical kernels to performance analysis and prediction tools. Since 2005 he has been responsible for the Computer Science Research Department within the Barcelona Supercomputing Center (BSC). He has been involved in research cooperation with many leading companies on HPC-related topics. His major current lines of work relate to performance analysis tools, programming models and resource management. His team distributes the open-source BSC tools (Paraver and Dimemas) and performs research on increasing the intelligence embedded in performance analysis tools. He is involved in the development of the OmpSs programming model and its different implementations for SMP, GPU and cluster platforms. He has been involved in Exascale activities such as IESP and EESI, where he has been responsible for the runtime and programming model sections of the respective roadmaps. He leads the programming models and resource management activities in the HPC subproject of the Human Brain Project.