BSC uses bioinformatics, artificial intelligence and the computing power of the MareNostrum supercomputer in the fight against the coronavirus

27 March 2020

Information updated June 4: The BSC conducts research on the genome of the virus and participates in the search for vaccines and drugs to combat it.

BSC stores and analyzes clinical data of COVID-19 patients for the creation of tools that assist clinicians in the diagnosis and treatment of the disease

The centre uses artificial intelligence, natural language processing and big data techniques to analyse the spread and social impact of the pandemic.

The MareNostrum 4 supercomputer is available to meet the needs of the scientific community fighting the coronavirus.
 

Barcelona Supercomputing Center (BSC) collaborates in the fight against the coronavirus in several areas: the application of bioinformatics to research on the virus and its possible treatments; the use of artificial intelligence, natural language processing and big data techniques to analyse data on the spread and impact of the pandemic; and the use of the MareNostrum 4 supercomputer to support research against the coronavirus.

Bioinformatics to search for treatments

On the bioinformatics side, the BSC is an example of how bioinformatics and supercomputers are now indispensable tools that help research centres with experimental laboratories accelerate the fight against the coronavirus. Bioinformatics is used for research on the virus and its possible treatments, analysing the coronavirus genome and its successive mutations, and searching for drugs and immune therapies (antibodies and vaccines).

Genomics

Understanding how the virus has evolved through different epidemics (such as the SARS epidemic in 2003, MERS in 2012, or the current COVID-19) is important because it allows us to understand how the virus can pass from one species to another and what changes it has to undergo to make this transmission possible. It sheds light on the virus's mode of transmission and the mechanisms it uses to interact with our immune system and the immune systems of other species. This is crucial when looking for treatments and for the prevention and prediction of possible future outbreaks.

This study is carried out on data available in public databases that house genomic sequences of the different virus mutations and animal species. The information is analysed with computer programs specifically designed for this purpose, some developed at the BSC itself and others by other teams. Processing these data requires great computational capacity, so the high-performance computing resources of the MareNostrum 4 supercomputer, hosted and managed by Barcelona Supercomputing Center, are used.

Search for treatments

Another important aspect is the search for treatments against the diseases caused by the coronavirus, including simulations that reproduce in silico the possible routes that can be exploited to attack this virus.

This process is known in research as "docking" and consists of simulating on the computer the interactions between the virus and the molecules that could be used to make vaccines, antibody treatments or drug treatments.

To carry out this process, the researchers use the knowledge generated by research on the virus genome, information on the structures of the virus's proteins, and data on drugs and other inorganic molecules, which are stored in computer libraries that contain millions of chemical compounds and the results obtained in previous experiments, collected over years by the scientific community.

Computer-based search, or drug screening, is very helpful in speeding up the process of finding and validating disease treatments and vaccines, as it greatly cuts the time and investment required for the first phase of this research. Any treatment or vaccine that computer models predict may be successful must subsequently be validated in experimental laboratories, animal testing, and clinical research, and refined in constant collaboration between the different research participants.
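As a rough illustration of the screening step described above, the following minimal sketch ranks a small compound library by a docking-style score and keeps the best candidates for later laboratory validation. It is not the BSC workflow or the PELE software: the compound names, the scores and the `shortlist` function are hypothetical, and it assumes that a lower score means stronger predicted binding.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str     # hypothetical compound identifier
    score: float  # hypothetical docking score; lower = stronger predicted binding


def shortlist(candidates, top_n):
    """Return the top_n candidates with the lowest (best) score."""
    return sorted(candidates, key=lambda c: c.score)[:top_n]


# Invented values for illustration only; real libraries hold millions of compounds.
library = [
    Candidate("compound_A", -6.1),
    Candidate("compound_B", -9.4),
    Candidate("compound_C", -4.8),
    Candidate("compound_D", -8.7),
]

for candidate in shortlist(library, top_n=2):
    print(candidate.name, candidate.score)
```

Only the shortlisted compounds would move on to experimental testing, which is where the savings in time and investment come from.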

To carry out this work, researchers at the BSC use different computer programs, including the PELE molecular interaction modelling software developed at BSC. This software and the power of the MareNostrum 4 supercomputer make it possible to perform thousands of computational experiments that optimize the binding of drugs and proteins quickly and effectively.

At the BSC, research on the virus and its possible treatments is carried out in close collaboration between the groups of Alfonso Valencia, ICREA researcher, director of the BSC Life Sciences Department and leader of the Computational biology group; Víctor Guallar, also an ICREA researcher, head of the Electronic and atomic protein modeling team and main driver of the PELE software; and Toni Gabaldón, ICREA researcher and head of the Comparative genomics group. All of them work in cooperation with the BSC operations team, which is in charge of providing them with the computational resources they need.

Currently, three projects channel the research carried out at BSC on the coronavirus and its possible treatments: EXSCALATE4CoV (E4C), funded by the European Commission under the H2020 programme; a collaborative project with the research centres IrsiCaixa and CreSa-IRTA, with the support of Grifols; and a project in collaboration with the Institute of Advanced Chemistry of Catalonia (IQA) and Nostrum Biodiscovery (NBD), funded by the COVID-19 fund of the Instituto de Salud Carlos III.

E4C places special emphasis on basic and applied research to search for drugs. The collaboration with IrsiCaixa and CreSa-IRTA focuses more on the search for immunological therapies supported by genomic research and bioinformatics tools, and the project with IQA and NBD focuses on the search for antiretrovirals that can inhibit the coronavirus causing COVID-19 and other subsequent coronaviruses.

Researchers of the Life Sciences department participate in the COVID-19 Disease Map, a platform to bring together and classify the scientific information related to the virus. It is an effort of researchers from 25 countries to bring together and organize the knowledge generated so far on the molecular map of viruses and the mechanisms of interaction between the SARS-CoV-2 virus and the host, guided by contributions from experts in the domain and based on published work. Its objective is to organize the available information to facilitate a better understanding of the disease and help in the development of efficient diagnoses and therapies.

This repository has been released through the journal Nature website and is an open collaboration between clinical researchers, life scientists, computational biologists, and data scientists, essential to be able to simulate virus behavior at the molecular level in computers and thus contribute to the search for vaccines and treatments.

Storage and analysis of clinical data of COVID-19 patients for the creation of tools that assist clinicians in the diagnosis and treatment of the disease

The evolution of the pandemic is generating abundant clinical material from the care of COVID-19 patients. The correct storage and analysis of this data will make it possible to generate knowledge that can be of great help in understanding the disease and in supporting the work of healthcare professionals.

BSC collaborates with hospitals, health centers and medical specialties in the collection, storage and arrangement of this data (mainly images and medical records) and in the creation of tools to search for patterns that may be useful in making diagnoses and assisting clinicians in treatment decisions.

• Researchers from the Department of Life Sciences linked to the National Institute of Bioinformatics participate in the creation of a common data platform on COVID-19 promoted by the European Union to ensure a rapid and coordinated response to the health crisis caused by COVID-19. The common data platform aims to aggregate and share all data generated from coronavirus research to accelerate the development of solutions to the virus and disease. It will include omics data (from disciplines such as genomics, proteomics and metabolomics), sequencing data, and clinical and epidemiological data. Spanish collaboration in this project is carried out through the Carlos III Health Institute (ISCIII) and the National Institute of Bioinformatics (INB), led by the BSC. This platform is one of the initiatives promoted within the framework of the ERAvsCORONA action plan launched by the European Commission to support research, coordinate efforts and seek synergies in the field of research and innovation.

• The Text Mining Unit of the Department of Life Sciences uses text mining and deep learning technologies to analyze COVID-19 patient data and create tools that help clinical staff predict the prognosis of new patients. Work is currently underway to create a data collection model to help hospitals create mutually compatible databases of demographic information, tests, laboratory and diagnostic imaging results, treatments and clinical reports for each patient at different times of their illness. With all this data, artificial intelligence techniques will be used to help forecast the evolution of new cases. This is a project in collaboration with the 12 de Octubre hospital in Madrid and the Hospital Clínic in Barcelona, and more hospitals are expected to join in the future. The tools are being developed based on the database opened by the HM hospitals.
 

• In a similar way, the High Performance Artificial Intelligence group participates in the CIBERES-UCI-COVID project, led by the Hospital Clínic in Barcelona and funded by the Carlos III Health Institute. The objective of this study is to use the data from COVID patients who have had to be admitted to ICUs, to identify risk factors and help clinicians make their prognoses.

• BSC hosts and analyses lung tomographies of COVID-19 patients from the European artificial intelligence platform AI4EU. This action is part of the REACT project, which aims to make available to the European artificial intelligence community a minimum of 40,000 CT scans (made up of 5,000 images each) to drive the creation of artificial intelligence algorithms that assist clinicians in diagnosing whether a patient is affected by COVID-19, determining the severity of the case and predicting its evolution. The project is carried out in collaboration with professional radiology associations, and BSC will be responsible for the technical coordination of the different solutions proposed.

• The High performance computational mechanics group is using its Alya Red heart simulator to study the possible effects of treatments used against COVID-19 on the cardiovascular system. Specifically, the group is studying a) the effect of antimalarial drugs on various human hearts with a variety of comorbidities that may be present in the infected population and b) the complex hemodynamics associated with north-south syndrome, in relation to venous-arterial extracorporeal membrane oxygenation therapy in patients with severe respiratory failure. The investigations are carried out in collaboration with the BSC spin-off Elem Biotech.

Artificial intelligence to analyse the spread and social impact of the pandemic

  • The BSC Data pre and post processing group, the Hospital Clínic and the UPF Center for Research in Economics and Health are developing predictive models of bed occupancy in health centers. The models are based on machine learning and offer occupancy predictions one week ahead, both for hospital beds and for related centers (e.g. extraordinary facilities set up to deal with COVID-19). The objective of this tool is to plan health logistics, both in acute phases of the pandemic and during the return to hospitals of patients affected by other diseases. The models are currently being used by the Hospital Clínic, the Sant Joan de Déu Hospital, the Bellvitge Hospital, the Olot Hospital and the Baix Empordà Integrated Health Area.
  • The Computational Biology group of the Life Sciences Department works on the development of reports and simulation tools to support decision-making for managing the pandemic. The data they collect refer to cases of COVID-19, mobility and geolocation. They come from different official sources (ministries of Transport and Health, the National Statistics Institute, the Departament de Salut de la Generalitat, etc.) and are analyzed with network-based methods to seek relationships between them and gain a better understanding of the spread of the disease. With the results, reports are prepared for health authorities and tools are developed to feed epidemiological models intended to support decision-making. The project is carried out at the request of the Carlos III Health Institute.
  • Researchers from the Applications for Science and Engineering department have collaborated in adapting the infectious disease simulator EpiGraph to the characteristics of COVID-19 in order to simulate the spread of the current pandemic in Spain. The simulator was developed at the Carlos III University of Madrid by Professor David Singh and researcher Cristina Marinescu, currently a researcher on the Smart Cities team at BSC. Both have updated it for COVID-19 under the supervision of the National Center for Epidemiology (CNE) and the Consortium Center for Biomedical Research Network (CIBER), and it is being extended with funds from the programme promoted by the Carlos III Health Institute. The simulator combines four models, each fed with different data: epidemiological, social interactions, mobility through transport and meteorological. The tool aims to assist health professionals in decision-making related to the pandemic by creating scenarios for the spread of the virus under different levels of mobility restrictions, selective vaccination scenarios and others.
  • Researchers from the Departments of Computer Sciences and Computational Applications for Science and Engineering are carrying out different studies on the contact tracing applications that various governments are considering launching to warn of possible COVID-19 infections during the stage of the pandemic that opens after confinement. These studies cover aspects related to privacy, interoperability between different applications, the necessary backends, and measures to weigh the real risk of contagion through different parameters related to electronic contact between citizens.
  • The Earth Sciences department is conducting studies on how confinement measures have affected air quality in Spain. The studies follow different methodologies. Among them, models based on machine learning show the real relationship between reducing emissions and improving air quality, taking into account the possible effects that meteorology could have had on air quality. These results will be used in ISGlobal studies that will analyse the effect that the reduction of emissions has had on the health of citizens.
  • BSC’s High Performance Artificial Intelligence (HPAI) research group collaborates with UNICEF on a project that aims to analyse the socioeconomic impact of the virus locally and globally, with an emphasis on social distancing. The goal is to find impact indicators, patterns and statistics that help the UN and local authorities take better and faster measures.
  • The same team of BSC’s artificial intelligence experts collaborates with Mexican researchers and other researchers at the centre in the creation of a data collection and analysis system to assist in decision-making to deal with COVID-19. The project is carried out in collaboration with Mexico City, Nuevo León and Jalisco: http://dash.covid19.geoint.mx/
  • The BSC user support service has assisted the Government of Spain and the Generalitat in screening public purchase contracts to locate medical equipment (especially respirators) acquired by centres that are not dedicated to health care and that could be transferred to health centers. In total, more than 850,000 documents have been screened using big data technologies, and the service is still open.
  • The Social Link Analytics group coordinates an initiative to analyse the tweets posted in Spain related to COVID-19. Citizens' sentiment in relation to the evolution of the pandemic is monitored daily.
  • Researchers from the Social Link Analytics group collaborate on a project that investigates the dynamics of spreading false news about health on social networks in Spain. The objective is to understand the mechanisms of its dissemination, in order to develop and disseminate guidelines that serve to counteract this phenomenon. The study is led by the University of Navarra and has been selected to obtain funding from the BBVA Foundation Program of Aid to Scientific Research Teams 2019.

MareNostrum 4 and user support

The MareNostrum 4 supercomputer, which despite the current circumstances is still in full operation, provides the necessary computational capacity to accelerate ongoing investigations against the coronavirus.

The BSC uses it for its own research, but the center has also made it available to research teams or external entities that need high-performance computing for their research against the coronavirus, either through the allocation of resources reserved to the BSC or through the calls for researchers that are made through the Spanish Network of Supercomputing (RES) or the European network PRACE.

PRACE has opened an extraordinary call for research on COVID-19, and the RES has reserved 50% of the resources (including 20% of the MareNostrum 4 supercomputer) for research related to the pandemic.

The BSC Operations Department provides support in the use of the MareNostrum 4, both to internal and external researchers.

Some of the external projects to which the BSC provides infrastructure and user service support are:

  • Hosting of a chest X-ray medical imaging data set to collaborate in the development of open source artificial intelligence tools to aid the early detection of COVID-19 infection and pneumonia. The data set has been prepared by the Bioinformatics and Statistics Unit of the Príncipe Felipe Research Center with data from different hospitals affiliated to the Medical Image Bank of the Valencian Community.
     
  • Molecular dynamics and sequence design simulations for the optimization of antibodies against SARS-CoV-2. Collaboration between researchers from the Physical Chemistry Department of the University of Barcelona at the National Center for Biotechnology and the University of Edinburgh.

Tuñón will use 12,000 processors (7.3% of the MareNostrum's capacity) for two and a half months to run batches of ten simulations at a time. In total, he will use 23.3 million processor hours to perform some 400 simulations.
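The figures quoted above can be cross-checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes that all 12,000 processors are busy for the whole allocation, and the variable names are chosen here for illustration.

```python
# Figures taken from the text above.
processors = 12_000
total_processor_hours = 23.3e6
simulations = 400
batch_size = 10
share_of_machine = 0.073  # 7.3% of MareNostrum's capacity

# Wall-clock duration if every processor is busy the whole time (assumption).
wall_clock_hours = total_processor_hours / processors
wall_clock_days = wall_clock_hours / 24

# Number of batches and average cost of one simulation.
batches = simulations // batch_size
hours_per_simulation = total_processor_hours / simulations

# Total processor count implied by the quoted 7.3% share.
implied_total_processors = processors / share_of_machine

print(f"wall-clock time: {wall_clock_days:.1f} days")
print(f"batches: {batches}")
print(f"processor hours per simulation: {hours_per_simulation:,.0f}")
print(f"implied machine size: {implied_total_processors:,.0f} processors")
```

Under that assumption the allocation corresponds to roughly 81 days of continuous use, which is in the same range as the two and a half months quoted, spread over 40 batches of ten simulations.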

The UV researcher has obtained access to MareNostrum 4 through the PRACE COVID-19 Fast Track Call, an extraordinary procedure by which the most powerful public supercomputers in Europe, which are part of the European infrastructure PRACE, have been made available for research on COVID-19.