BSC uses bioinformatics, artificial intelligence and the computing power of the MareNostrum supercomputer in the fight against the coronavirus

27 March 2020

Information updated October 27: The BSC conducts research on the genome of the virus and participates in the search for vaccines and drugs to combat it.

BSC stores and analyzes clinical data of COVID-19 patients for the creation of tools that assist clinicians in the diagnosis and treatment of the disease

The centre uses artificial intelligence, natural language processing and big data techniques to analyse the spread and the social impact of the pandemic.

The MareNostrum 4 supercomputer is available to meet the needs of the scientific community fighting the coronavirus.
 

Barcelona Supercomputing Center (BSC) is contributing to the fight against the coronavirus in several areas: the application of bioinformatics to research on the virus and its possible treatments; the use of artificial intelligence, natural language processing and big data techniques to analyse data on the spread and impact of the pandemic; and the use of the MareNostrum 4 supercomputer to support research against the coronavirus.

1. Bioinformatics to search for treatments

From the bioinformatics side, the BSC is an example of how bioinformatics and supercomputers are nowadays an indispensable tool for research centres that have experimental laboratories to accelerate the fight against the coronavirus. Bioinformatics is used for research on the virus and its possible treatments, analysing the coronavirus genome and its successive mutations, and searching for drugs and immune therapies (antibodies and vaccines).

Genomics

Understanding how the virus has evolved through different epidemics (such as the SARS epidemic in 2003, MERS in 2012, or the current COVID-19) is important because it allows us to understand how the virus can pass from one species to another and what changes it has to undergo to make this transmission possible. It sheds light on the virus's mode of transmission and the mechanisms it uses to interact with our immune system and the immune systems of other species. This is crucial when looking for treatments and for the prevention and prediction of possible future outbreaks.

This study is carried out on data available in public databases that house genomic sequences of the different virus mutations and animal species. The information is analysed with computer programs designed specifically for this purpose, some developed at the BSC itself and others by other teams. Processing these data requires great computational capacity, and therefore the high-performance computing resources of the MareNostrum 4 supercomputer, hosted and managed by Barcelona Supercomputing Center, are used.

Search for treatments

Another important aspect is the search for treatments against the diseases caused by the coronavirus, including simulations that reproduce in silico the possible routes that can be exploited to attack this virus.

This process is known in research as "docking" and consists of simulating on the computer the interactions between the virus and the molecules that could be used to make vaccines, antibody treatments or drug treatments.

To carry out this process, the researchers use the knowledge generated in the research on the virus genome, information on the structures of the viral proteins and data on drugs and other inorganic molecules, which are stored in computer libraries that contain millions of chemical compounds and the results obtained in previous experiments, collected over years by the scientific community.

Computer search or drug screening is very helpful in speeding up the process of finding and validating disease treatments and vaccines, as it greatly cuts the time and investment required for the first phase of this research. Any treatment or vaccine that computer models predict may be successful must subsequently be validated in experimental laboratories, animal testing, and clinical research, and refined in constant collaboration between different research participants.

To carry out this work, researchers at the BSC use different computer programs, including the PELE molecular interaction modelling software developed at BSC. This software and the power of the MareNostrum 4 supercomputer make it possible to run thousands of computational experiments that optimize the binding of drugs and proteins quickly and effectively.

At the BSC, research on the virus and its possible treatments is carried out in close collaboration between the groups of Alfonso Valencia, ICREA researcher, director of the BSC Life Sciences Department and leader of the Computational biology group; Víctor Guallar, also an ICREA researcher, head of the Electronic and atomic protein modeling team and main promoter of the PELE software; and Toni Gabaldón, ICREA researcher and head of the Comparative genomics group. All of them work in cooperation with the BSC operations team, which is in charge of providing them with the computational resources they need.

  • 1.1 Projects related to the search for drugs and vaccines

Currently, three projects channel the research carried out at BSC on the coronavirus and its possible treatments: EXSCALATE4CoV (E4C), funded by the European Commission under the H2020 program; a collaborative project with the research centres IrsiCaixa and CreSa-IRTA, with the support of Grifols; and a project in collaboration with the Institute of Advanced Chemistry of Catalonia (IQA) and Nostrum Biodiscovery (NBD), funded by the COVID-19 fund of the Instituto de Salud Carlos III.

E4C places particular emphasis on basic and applied research to search for drugs. The collaboration with IrsiCaixa and CreSa-IRTA is more focused on the search for immunological therapies supported by genomic research and bioinformatics tools, and the project with IQA and NBD focuses on the search for antiretrovirals that can inhibit the coronavirus causing COVID-19 and other subsequent coronaviruses.

  • 1.2 Participation in collective initiatives to generate more biomedical knowledge about the disease

A) Researchers of the Life Sciences department participate in the COVID-19 Disease Map, a platform to bring together and classify the scientific information related to the virus. It is an effort of researchers from 25 countries to bring together and organize the knowledge generated so far on the molecular map of viruses and the mechanisms of interaction between the SARS-CoV-2 virus and the host, guided by contributions from experts in the domain and based on published work. Its objective is to organize the available information to facilitate a better understanding of the disease and help in the development of efficient diagnoses and therapies.

This repository was released through the journal Nature website and is an open collaboration between clinical researchers, life scientists, computational biologists, and data scientists, essential to be able to simulate virus behavior at the molecular level in computers and thus contribute to the search for vaccines and treatments.

B) Coordination of the new European center of excellence for HPC and personalized medicine PerMedCoE

The European Commission has approved the BSC initiative to coordinate a Center of Excellence for HPC Applications in personalized medicine (PerMedCoE). The objective of this center is to combine in an agile way computational models of biochemical and cellular processes with experimental validations.

In the case of research on SARS-CoV-2, BSC will be in charge of creating simulations that reproduce the behavior of human cells in their interaction with the virus, based, among other sources, on the knowledge organized in the COVID-19 Disease Map.

The CSC supercomputing center in Finland, the Royal Swedish Institute of Technology, the IBM Research center in Zurich (Switzerland), the European Molecular Biology Laboratory (EMBL), the Center for Genomic Regulation of Barcelona, the Curie Institute (France), the Heidelberg University Hospital (Germany), the Max Delbrück molecular medicine center (Germany), the University of Luxembourg, the University of Ljubljana (Slovenia) and the companies ATOS Spain and Elem Biotech (a BSC spin-off) also participate in this center of excellence.

2. Creation of tools that assist clinicians in the diagnosis and treatment of the disease

The evolution of the pandemic is generating abundant clinical material from the care of COVID-19 patients. The correct storage and analysis of this data will make it possible to generate knowledge that can be of great help in understanding the disease and in supporting the work of healthcare professionals.

BSC collaborates with hospitals, health centers and medical specialties in the collection, storage and arrangement of this data (mainly images and medical records) and in the creation of tools to search for patterns that may be useful in making diagnoses and assisting clinicians in treatment decisions.

Likewise, measures and procedures are being studied to ensure that these predictive models include the impact of sex and gender on the disease.

  •  Creation of predictive models of disease evolution in COVID-19 patients based on clinical texts and AI, in collaboration with Hospital Clínic, Hospital Universitario 12 de Octubre and Hospital Virgen del Rocío.

    The Department of Life Sciences is working on the creation of predictive models based on Artificial Intelligence to predict the progression of the disease in COVID-19 patients. These tools are built on the information contained in the clinical reports of the thousands of COVID-19 patients who have visited these hospitals. The work consists of standardizing the information contained in the clinical documents (demographic information, tests, laboratory and diagnostic imaging results, treatments and clinical reports of each patient at different stages of their disease) and training an AI-based model (deep learning neural networks), which will look for common patterns and generate predictions about the evolution of new patients.

    The collaboration with the Hospital Clínic is already fully active, and the collaborations with the 12-O and Virgen del Rocío hospitals are pending formal details. The project is open to the incorporation of new hospitals.

    From April to the present, the team in charge of the project (Text Mining) has carried out proof-of-concept tests of predictive models with the data of 2,000 patients that the HM Hospitales chain opened to the scientific community. These models have given very promising results, and we are waiting to receive the data from the collaborating hospitals in ISARIC format in order to validate them with larger sets. In total, data from more than 15,000 patients are expected to be collected.

    The initiative is developed within the framework of the Language Technologies Plan promoted by the Secretariat for Digitization and Artificial Intelligence of the Ministry of Economy.
     

  • In a similar vein, the High Performance Artificial Intelligence group participates in the CIBERES-UCI-COVID project, led by the Hospital Clínico and funded by the Carlos III Health Institute. The objective of this study is to use the data of COVID patients who have had to be admitted to ICUs, to identify risk factors and help clinicians to make their prognoses. So far it has collected the clinical reports of 2,000 patients and the data analysis is about to begin.
  •  Researchers from the Department of Life Sciences linked to the National Institute of Bioinformatics participate in the common data platform on COVID-19 promoted by the European Union to ensure a rapid and coordinated response to the health crisis caused by COVID-19. The common data platform aims to aggregate and share all data generated from coronavirus research to accelerate the development of solutions to the virus and the disease. Omics data (those from disciplines such as genomics, proteomics and metabolomics), sequencing, clinical and epidemiological data will be included. Spanish collaboration in this project is carried out through the Carlos III Health Institute (ISCIII) and the National Institute of Bioinformatics (INB), led by the BSC. This platform is one of the initiatives promoted within the framework of the ERAvsCORONA action plan launched by the European Commission to support research, coordinate efforts and seek synergies in the field of research and innovation.
  •  BSC hosts and analyses lung tomographies of COVID-19 patients from the European artificial intelligence platform AI4EU. This action is part of the REACT project, which aims to make available to the European artificial intelligence community a minimum of 40,000 CT scans (made up of 5,000 images each) to drive the creation of artificial intelligence algorithms that assist clinicians in diagnosing whether a patient is affected by COVID-19, determining the severity of the case and predicting its evolution. The project is carried out in collaboration with professional radiology associations, and BSC will be responsible for the technical coordination of the different solutions proposed.
  • The High performance computational mechanics group is using its Alya Red heart simulator to study the possible effects on the cardiovascular system of treatments used against COVID-19. Specifically, the group is studying a) the effect of antimalarial drugs on various human hearts with a variety of comorbidities that may be present in the infected population and b) the complex hemodynamics associated with north-south syndrome, in relation to venous-arterial extracorporeal membrane oxygenation therapy in patients with severe respiratory failure. The investigations are carried out in collaboration with the BSC spin-off Elem Biotech.

3. Artificial intelligence to analyse the spread and social impact of the pandemic

  • The Computational Biology group of the Department of Life Sciences is working on the development of a publicly accessible geographic information system on the spread of COVID-19 outbreaks, which integrates different data sources from public administrations to support the analysis of the spread of the pandemic and decision-making related to the management of new outbreaks of COVID-19.

    The platform gathers data on health, mobility and geolocation from the Ministry of Health, the Ministry of Transport, Mobility and Urban Agenda, the National Institute of Statistics, the Carlos III Health Institute, the Catalan and Basque health agencies, among others.

    The analysis of the data is carried out with network systems that seek the relationships between them, with the aim of obtaining a better understanding of the spread of the disease. With the results, reports are made for health authorities and tools are developed to nurture epidemiological models intended to support decision-making. The project is carried out at the request of the Carlos III Health Institute.
     

  • The same group participates in the creation of the Epidemiological Observatory of Catalonia, which will use Big Data and Artificial Intelligence techniques to generate a new collection of epidemiological models that help public health institutions to prevent, detect early and mitigate the spread of epidemics.

    This initiative combines the efforts of the Generalitat de Catalunya, medical and health institutions (Hospital Germans Trias i Pujol and Fundación Lucha contra el Sida), leading technological research centers (BSC, CIDA, Eurecat, URV and CSIC), mobile phone operators (Telefónica, Orange and GSMA) and the Mobile World Capital Barcelona.

    The BSC's task will be to collaborate in the development of a pandemic model for future prevention, including all data sources, and also in data storage, computing, health data management, and meteorological data computing.
     

  • The BSC Data pre and post processing group, the Hospital Clínic and the UPF Center for Research in Economics and Health are developing predictive models of bed occupancy in health centers. The models are based on machine learning and offer occupancy predictions one week ahead, both for hospital beds and for related centers (e.g. extraordinary facilities to deal with COVID-19). The objective of this tool is to plan health logistics, both in acute phases of the pandemic and during the return to hospitals of patients affected by other diseases. The models are currently being used by the Hospital Clínic, the Sant Joan de Déu Hospital, the Bellvitge Hospital, the Olot Hospital and the Baix Empordà Integrated Health Area.
     
  •  Researchers from the Applications for Science and Engineering department have collaborated in adapting the infectious disease simulator EpiGraph to the characteristics of COVID-19 in order to simulate the spread of the current pandemic in Spain. The simulator was developed at the Carlos III University of Madrid by Professor David Singh and researcher Cristina Marinescu, currently a researcher on the Smart Cities team at BSC. Both have updated it for COVID-19, under the supervision of the National Center for Epidemiology (CNE) and the Consortium Center for Biomedical Research Network (CIBER), and it is being extended with funds from the programme promoted by the Carlos III Health Institute. The simulator unites four models, each one fed with different data: epidemiological, social interactions, mobility through transport and meteorological. The tool aims to assist health professionals in decision-making related to the pandemic, as it creates scenarios for the spread of the virus under different levels of mobility restrictions, selective vaccination scenarios and others.
     
  •  Researchers from the Departments of Computational Applications for Science and Engineering and Computer Sciences have assisted the Secretary of State for Digitization and Artificial Intelligence in the development of the coronavirus contact tracing application that was launched on the island of La Gomera in July.

    The BSC has contributed academic studies and advice on a) the interoperability of different tracing applications, b) the modeling of risk scoring algorithms, c) the functional design of the infrastructure and d) recommendations regarding the system architecture.

    This advice followed months of investigating the protocols and characteristics of the tracing applications that different European governments were implementing, as well as aspects related to privacy, interoperability, backend, and measures to weigh the real risk of contagion through parameters related to electronic contact.

    BSC plans to collaborate with the Generalitat de Catalunya in the development of its version of the tracking application.
     

  • The Earth Sciences department is conducting several studies to quantify how confinement measures have affected air quality in Spain. The studies follow different methodologies; among them, models based on machine learning show the real relationship between reduced emissions and improved air quality, taking into account the possible effects of meteorology on air quality.

    The innovations contained in this methodology have been the origin of new BSC collaborations with the European Centre for Medium-Range Weather Forecasts (ECMWF), the entity in charge of the Atmospheric Monitoring Service of the European Union Copernicus program.

    Its results are being used in different applications, such as Copernicus' emission and quality modeling, including the European information service on air quality in support of the COVID-19 crisis and the activities carried out by this entity to provide support to users of European policies on air quality issues.

    They are also used in ISGlobal studies that will analyze the effect that reducing emissions has had on the health of citizens.
     

  •  The same department has developed a Europe-wide service that issues heat wave alerts weeks in advance for vulnerable populations, within the S2S4E Climate Services for Clean Energy project. One of the objectives is to help the general population and energy providers to prepare for extreme heat events.
     
  •  BSC’s High Performance Artificial Intelligence (HPAI) research group collaborates with UNICEF on a project that aims to analyse the socioeconomic impact of the virus locally and globally, with an emphasis on social distancing. The goal is to find impact indicators, patterns and statistics that help the UN and local authorities to take better and faster measures.
     
  •  The same team of BSC’s artificial intelligence experts collaborates with Mexican researchers and other researchers at the centre in the creation of a data collection and analysis system to assist in decision-making to deal with COVID-19. The project is carried out in collaboration with Mexico City, Nuevo León and Jalisco: http://dash.covid19.geoint.mx/
     
  • The BSC user support service has assisted the Government of Spain and the Generalitat in screening public purchase contracts to locate medical equipment (especially respirators) acquired by centres that are not dedicated to health care and that could be transferred to health centers. In total, more than 850,000 documents have been screened using big data technologies and the service is still open.
     
  • The Social Link Analytics group coordinates an initiative to carry out an analysis of the tweets issued in Spain related to COVID-19. There is a daily monitoring of the evolution of citizens' sentiment in relation to the evolution of the pandemic.
     
  • Researchers from the Social Link Analytics group collaborate on a project that investigates the dynamics of spreading false news about health on social networks in Spain. The objective is to understand the mechanisms of its dissemination, in order to develop and disseminate guidelines that serve to counteract this phenomenon. The study is led by the University of Navarra and has been selected to obtain funding from the BBVA Foundation Program of Aid to Scientific Research Teams 2019.

MareNostrum 4 and user support

The MareNostrum 4 supercomputer provides the necessary computing power to accelerate ongoing investigations against the coronavirus.

The BSC Operations department provides support in the use of MareNostrum 4, both to internal and external researchers. During the last 6 months, among all the listed projects, more than 39.5 million computing hours have been consumed in research related to the fight against the coronavirus.

External users have been able to use the BSC infrastructures through the allocation of computing hours reserved for the BSC or through the calls for researchers issued by the Spanish Supercomputing Network (RES) or the European network PRACE.

PRACE opened an extraordinary call for COVID-19 investigations in March.

Through this call, the group led by Iñaki Tuñón, professor of chemistry and physics at the University of Valencia, accessed MareNostrum to simulate the chemical reactivity of the SARS-CoV-2 protease and obtain information that helps to design drugs that inhibit it and thus prevent the virus from replicating.

For two and a half months (from April to June), this team used more than 12,000 processors (7.3% of MareNostrum's capacity). In total, it used 23.4 million processor hours to perform more than 400 simulations.
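As a rough consistency check of these figures, the short Python sketch below estimates a lower bound for the processor hours and for the share of the machine. It rests on two assumptions that are not stated in this article: that "two and a half months" means about 75 days of continuous use, and that the general-purpose block of MareNostrum 4 has 165,888 cores (3,456 nodes with 48 cores each).

```python
# Assumed values, not taken from the article: 75 days of continuous use
# and 165,888 cores in MareNostrum 4's general-purpose block.
PROCESSORS = 12_000      # "more than 12,000 processors", so a lower bound
DAYS = 75                # assumed length of "two and a half months"
TOTAL_CORES = 165_888    # assumed core count of the machine

# Processor hours = processors x days x 24 hours per day.
processor_hours = PROCESSORS * DAYS * 24

# Fraction of the machine occupied by the allocation.
share = PROCESSORS / TOTAL_CORES

print(f"{processor_hours / 1e6:.1f} million processor hours")
print(f"{share:.1%} of the machine")
```

Under these assumptions the lower bound comes out at about 21.6 million processor hours and roughly 7.2% of the cores, which is in line with the "more than 12,000 processors", 23.4 million hours and 7.3% quoted above.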

The Spanish Supercomputing Network (RES) has reserved 50% of the resources in its last two calls, 2020-2 and 2020-3, for research related to the pandemic. These resources include 20% of the MareNostrum 4 supercomputer.

Through the 2020-2 call, which was evaluated in June, the following projects related to the fight against SARS-CoV-2 have obtained access to the BSC's supercomputing infrastructures:

 

Title | Leader Name | Leader Institution | Khours
Exploring COVID19 Infectious Mechanisms and Host Selection Process | Modesto Orozco | Institut de Recerca Biomèdica (IRB) | 4839
In silico toxicology prediction for compounds binding to the SARS-CoV-2 protease | Victor Guallar Tasies | Barcelona Supercomputing Center (BSC-CNS) | 3600
Structural analysis by cryo EM of SARS Cov-2 Spike in complex with human neutralizing antibodies | José María Carazo | Centro Nacional de Biotecnología | 140
Simulating COVID-19 propagation at a European-level | David Expósito Singh | Universidad Carlos III de Madrid (UC3M) | 9300
Searching for small compounds as stabilizers of the inactive spike protein in SARS-COV-2 | F. Javier Luque | Universitat de Barcelona | 3367
Identification and Design of drugs for SARS-CoV2 nsp1 and the nsp1:40S ribosome complex | Francesco Luigi Gervasio | University College London | 4942
Phenotypic targeting of COVID-19 spike protein ACE2 interface for safe drug delivery | Giuseppe Battaglia | Institute for Bioengineering of Catalonia (IBEC) | 3664.56
Molecular dynamics simulations of the interaction between the SARS-Cov-2 virus and surfaces of different materials | Jordi Faraudo | Institut de Ciencia de Materials de Barcelona, ICMAB-CSIC | 300
Droplet characterization of coughing and breathing | Pedro Martí Gómez-Aldaraví | Universitat Politècnica de València | 135
Performance evaluation of Individual protection equipment using High Fidelity Simulations | Antonio Gil Megías | Universitat Politècnica de València | 135
Semiconductor oxide surface applications: catalytic, sensor and biological evaluation | Juan Andrés Bort | Universitat Jaume I (UJI) | 167
MultiScale Simulations of the Activity of 3CL Protease of SARS-CoV-2 | Iñaki Tuñón | Universitat de València | 8400
Revealing the molecular mechanisms of catalysis and inhibition of SARS-CoV-2 Mpro: towards the design of a COVID-19 antiviral drug. | Vicent Moliner | Universitat Jaume I (UJI) | 381.3
Evaluation of the pH-dependence of the SARS-Cov-2 main protease by molecular dynamics simulations. | Sergio Madurga Diez | Universitat de Barcelona | 133
Computational chemistry from static and dynamic approaches to block the protease of COVID | Albert Poater | Universitat de Girona | 2000
Ultra-wide screening of ligand binding targets locking the SARS-cov-2 Glycoprotein S in the down conformation (COVID-LOCK) | Ivan Coluzza | CICbiomaGUNE | 2200

 
In the 2020-3 call, which was evaluated in October, 11 research activities related to the fight against the coronavirus were approved, with a combined allocation of 35 million computing hours.

 

Title | Leader Name | Institution | Khours
Computer Design of Inhibitors of SARS-CoV-2 Mpro by QM/MM Simulations: Towards the Design of Efficient COVID-19 Antiviral Drugs | Vicent Moliner | Universitat Jaume I (UJI) | 1108
Droplet characterization of coughing and breathing | Pedro Martí | Universitat Politècnica de València | 130
Exploring Covid19 Infectious Mechanisms and Host Selection Process | Modesto Orozco López | Institut de Recerca Biomèdica (IRB) | 7000
Exploring peptide/MHC dissociation landscapes using Hierarchical Natural Move Monte Carlo | Jordi Villà Freixa | Universitat Internacional de Catalunya | 700
High-throughput model exploration of multi-scale simulations of SARS-CoV-2 infection | Alfonso Valencia | Barcelona Supercomputing Center (BSC-CNS) | 5000
Identification and Design of drugs for SARS-CoV2 nsp1 and the nsp1:40S ribosome complex | Francesco Luigi Gervasio | University College London | 500
In silico toxicology prediction for compounds binding to the SARS-CoV-2 protease (continued) | Victor Guallar | Barcelona Supercomputing Center (BSC-CNS) | 3600
MultiScale Simulations of the Activity of 3CL Protease of SARS-CoV-2 | Iñaki Tuñón | Universitat de València | 8500
Numerical investigation of turbulent dispersion of infectious aerosol clouds generated by sneezes and other violent respiratory events. | Alexandre Fabregat Tomas | Universitat Rovira i Virgili | 3000
Searching for small compounds as stabilizers of the inactive spike protein in SARS-COV-2 | Fco. Javier Luque Garriga | Universitat de Barcelona | 2516
Turbulent dispersion and surface deposition of pathogen-laden droplets in enclosed rooms. | Alexandre Fabregat Tomas | Universitat Rovira i Virgili | 3000
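
The total of 35 million hours quoted for the 2020-3 call can be cross-checked against the Khours column above. The following is a minimal Python sketch, assuming that one Khour equals 1,000 computing hours; the values are copied by hand from the 2020-3 table, in row order, and the variable names are illustrative.

```python
# Khours of the 2020-3 activities, copied from the table in row order.
khours_2020_3 = [1108, 130, 7000, 700, 5000, 500, 3600, 8500, 3000, 2516, 3000]

# Assumption: 1 Khour = 1,000 computing hours.
total_khours = sum(khours_2020_3)
total_million_hours = total_khours / 1000

print(len(khours_2020_3), "activities,", total_million_hours, "million hours")
```

The list has 11 entries that add up to 35,054 Khours, which matches the 11 activities and the rounded figure of 35 million hours given in the text.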

In addition to those mentioned, the following projects also use the BSC facilities:

  • Hosting of a chest X-ray medical imaging data set (BIMCV-COVID-19) to collaborate in the development of open source artificial intelligence tools to aid the early detection of COVID-19 infection and pneumonia. The data set has been prepared by the Joint Unit of Biomedical Imaging FISABIO-CIPF with data from different hospitals affiliated to the Medical Image Bank of the Valencian Community (BIMCV).
  • Molecular dynamics and sequence design simulations for the optimization of antibodies against SARS-CoV-2. Collaboration between researchers from the Physics-Chemistry Department of the University of Barcelona, the National Center for Biotechnology and the University of Edinburgh.