BSC nominated for Best Paper Award at SC15

16 September 2015

The paper is entitled "Exploiting Asynchrony from Exact Forward Recovery for DUE in Iterative Solvers".

The paper, by Luc Jaulmes, Marc Casas, Miquel Moretó, Eduard Ayguadé, Jesús Labarta and Mateo Valero, has been nominated for the Best Paper Award at SC15.

About the paper:
This paper presents a method to protect iterative solvers from Detected and Uncorrected Errors (DUE), relying on error detection techniques already available in commodity hardware. Detection operates at the memory page level, which enables the use of simple algorithmic redundancies to correct errors. Such redundancies would be inapplicable under coarse-grained error detection, but they become very powerful when the hardware is able to detect errors precisely. Relations straightforwardly extracted from the solver allow lost data to be recovered exactly. This method is free of the overheads of backward recoveries such as checkpointing, and does not compromise the mathematical convergence properties of the solver as restarting would. We apply this recovery to three widely used Krylov subspace methods: CG, GMRES and BiCGStab. We implement and evaluate our resilience techniques on CG, showing very low overheads compared to state-of-the-art solutions. Overlapping recoveries with the normal work of the algorithm decreases overheads further.
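To give a rough idea of what recovering lost data from a relation in the solver can look like, the sketch below assumes that one contiguous block of the iterate x (standing in for a memory page) has been lost, while the matrix A, the right-hand side b and the residual r = b - Ax are intact. The choice of the residual relation, the function names (`recover_block`, `solve_dense`) and the tiny 4x4 system are illustrative assumptions, not the authors' implementation. Restricting the relation to the lost rows gives a small linear system for the lost entries, which can be solved exactly.

```python
# Illustrative sketch only: all names and the example system are hypothetical.

def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def solve_dense(M, rhs):
    """Solve M y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    y = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * y[j] for j in range(i + 1, n))
        y[i] = (aug[i][n] - tail) / aug[i][i]
    return y

def recover_block(A, b, r, x, lost):
    """Rebuild the entries x[i], i in lost, from the relation r = b - A x.

    For each lost row i:
        sum_{j in lost} A[i][j] * x[j] = b[i] - r[i] - sum_{j not in lost} A[i][j] * x[j]
    Only intact entries of x are read on the right-hand side.
    """
    lost_set = set(lost)
    rhs = []
    for i in lost:
        known = sum(A[i][j] * x[j] for j in range(len(x)) if j not in lost_set)
        rhs.append(b[i] - r[i] - known)
    A_lost = [[A[i][j] for j in lost] for i in lost]
    return solve_dense(A_lost, rhs)

# Small symmetric positive definite system, as CG would require.
A = [[4.0, -1.0, 0.0, 0.0],
     [-1.0, 4.0, -1.0, 0.0],
     [0.0, -1.0, 4.0, -1.0],
     [0.0, 0.0, -1.0, 4.0]]
b = [1.0, 1.0, 1.0, 1.0]
x = [1.0, 2.0, 3.0, 4.0]                       # current iterate
r = [bi - ax for bi, ax in zip(b, matvec(A, x))]  # residual kept by the solver

lost = [1, 2]                                  # indices hit by the simulated error
damaged = list(x)
for i in lost:
    damaged[i] = float("nan")                  # the lost values are unusable

recovered = recover_block(A, b, r, damaged, lost)
for i, value in zip(lost, recovered):
    damaged[i] = value
print(damaged)
```

Because the lost entries are recomputed from data the solver already holds, no checkpoint is read and the iteration can continue from the same state. In this sketch the diagonal block of a symmetric positive definite matrix is itself positive definite, so the small system always has a unique solution.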

Each year, the SC Technical Papers Committee identifies one paper as the best paper from the Conference's Technical Program.

For more information, visit the SC15 webpage.
You can find Best Paper and Best Student Paper Finalists here.