SORS: Genomics workloads for the future (now)

Date: 07/Nov/2024 Time: 12:00

Place:

[HYBRID] Sala d'Actes, FiB and Online via Zoom

Abstract

Pjotr Prins - (UTHSC)

Short Bio

Pjotr is a bioinformatician at large and associate (coding) professor who runs a research lab at the Department of Genetics, Genomics and Informatics at the University of Tennessee Health Science Center. Pjotr is also director of Genenetwork.org, an Oxford University Nuffield Department of Medicine honorary visiting research fellow in statistical genetics, and a former visiting research fellow of the Personal genomics and bioinformatics department of the University Medical Center Utrecht and the Groningen Bioinformatics Center. As of 2021 Pjotr has been appointed adjunct associate professor at Pwani University, Kenya, and is a member of the ERIBA Laboratory of Genome Structure and Ageing at UMC Groningen, the Netherlands. Pjotr is a founder, editor and coder for BioHackrXiv.org and a former editor for the Journal of Open Source Software (JOSS). Pjotr received his PhD at the Laboratory of Nematology, Wageningen University, the Netherlands.

Title: Genomics workloads for the future (now)

Abstract: In this talk, Pjotr and Arun will talk about pangenomic challenges and scaling up with HPC. Genomics workloads have long been announced but have so far failed to make a dent on compute clusters. We predict this may now change with the introduction of pangenomes and sub-$100 whole-genome sequencing, which is around the corner.


Arun Isaac - (UCL London)

Short bio

Arun Isaac is a postdoc with Prof. Richard Mott at the Department of Genetics, Evolution & Environment at University College London. Earlier, he completed a PhD in Computational Science at the Indian Institute of Science, Bengaluru. Arun is an avid programmer with a particular interest in Lisp.
 

Title: Ravanan---a high performance Common Workflow Language implementation

Abstract: We will describe ccwl and ravanan, two Common Workflow Language (CWL) tools we have developed that allow running scientific workflows on HPC clusters with concision, reproducibility and performance. ravanan uses propagator networks to provide high cluster utilization. Intermediate computational steps are run as early as possible without blocking, even when only partial results are available from previous steps. In addition, ravanan provides strong caching and reproducibility guarantees so that computation never has to be repeated.
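
For readers unfamiliar with propagator networks, the following toy sketch may help illustrate the general idea of running a step as soon as its inputs are available. It is not ravanan's code or API: the `Cell` and `Propagator` classes, the step names and the sample data are all hypothetical, Python is used only for illustration, and everything runs sequentially in one process, whereas a real scheduler would presumably submit jobs to a cluster asynchronously.

```python
# Illustrative toy only: NOT ravanan's implementation or API.
# All class, function and step names here are hypothetical.

class Cell:
    """Holds one value and notifies dependent propagators when it is set."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.ready = False
        self.listeners = []

    def set(self, value):
        if self.ready:
            return  # value already known, nothing new to propagate
        self.value = value
        self.ready = True
        for propagator in self.listeners:
            propagator.maybe_fire()


class Propagator:
    """Runs `func` once, as soon as every input cell has a value."""

    def __init__(self, name, func, inputs, output, log):
        self.name = name
        self.func = func
        self.inputs = inputs
        self.output = output
        self.log = log
        self.fired = False
        for cell in inputs:
            cell.listeners.append(self)

    def maybe_fire(self):
        if self.fired or not all(c.ready for c in self.inputs):
            return  # already ran, or still waiting on some input
        self.fired = True
        self.log.append(self.name)
        self.output.set(self.func(*(c.value for c in self.inputs)))


def run_demo():
    log = []  # records the order in which steps actually ran
    reads, reference = Cell("reads"), Cell("reference")
    trimmed, index, aligned = Cell("trimmed"), Cell("index"), Cell("aligned")

    Propagator("trim", lambda r: r.strip("N"), [reads], trimmed, log)
    Propagator("index", lambda ref: {"ref": ref}, [reference], index, log)
    Propagator("align", lambda t, i: (t, i["ref"]), [trimmed, index], aligned, log)

    reads.set("NNACGTNN")      # "trim" runs right away; "align" still waits
    reference.set("ACGTACGT")  # "index" runs, which then unblocks "align"
    return log, aligned.value


if __name__ == "__main__":
    order, result = run_demo()
    print(order, result)
```

In this sketch the "trim" step does not wait for the reference to arrive; each step fires the moment its own inputs are known, which is the property the abstract describes as running steps as early as possible.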

Speakers

Speakers: Pjotr Prins (UTHSC), Arun Isaac (UCL London)
Host: Santiago Marco Sola. Associate Researcher. High Performance Domain-Specific Architectures - Computer Sciences, BSC