BSC researchers bring the power of Artificial Intelligence to the small datasets of rare diseases

13 April 2021
They have developed a new method for the characterization of rare diseases and applied it to the study of a childhood brain tumor.

The research, published today in Cell iScience, was carried out within the framework of the European iPC project

This work opens up a new avenue for the development of computational methods that are designed specifically for rare diseases

Barcelona Supercomputing Center (BSC) researchers Iker Núñez-Carpintero and Davide Cirillo, from the Computational Biology group led by Alfonso Valencia, together with Marianyela Petrizzelli and Andrei Zinovyev from the Institut Curie, have developed a novel method for the characterization of rare diseases and applied it to the study of a childhood brain tumor called medulloblastoma. The article “The multilayer community structure of medulloblastoma” has been published today in Cell iScience and represents a pivotal research achievement for the European project iPC: individualizedPaediatricCure.

Rare diseases, such as pediatric tumors, are a serious medical and human problem, and the analysis of their molecular causes is a particularly complex scientific challenge. The most powerful Artificial Intelligence technologies are designed for the analysis of large data sets, not for the small numbers of patients typical of these diseases.

The new methodology has great potential for the study of other rare diseases. A disease is defined as rare when it affects a very small portion of the population. Most rare diseases are genetic and appear in childhood. Because such conditions are rare, a major issue in studying them is the small sample size, which prevents statistically firm conclusions about any relevant findings. Moreover, small sample sizes rule out Artificial Intelligence approaches that rely on a large number of training examples. The same BSC authors discussed this problem in a recent review article published in Molecular Oncology in February 2021, which laid the foundations for this research.

This work opens up a new avenue for the development of computational methods that are designed specifically for rare diseases. Indeed, beyond the current results, this effort highlights the relevance and urgency of implementing computational solutions for the complex scenario of rare diseases where standard approaches developed for common diseases and cancer fail. This research represents a step forward not only in the identification of distinct attributes of medulloblastoma subtypes but, in the long term, in the use of multilayer networks for the analysis of rare diseases.

Medulloblastoma is a rare embryonic tumor with unknown causes. Despite being rare, it is the most common cancerous brain tumor in children. The disease can be classified according to molecular criteria into four subgroups: WNT-activated, SHH-activated, Group 3, and Group 4. By applying an optimization procedure to a complex network representation of biomedical knowledge, called a multilayer network, the researchers studied the associations among genes across large amounts of heterogeneous information, including protein interactions, drug targets, genetic variants, cellular pathways, and metabolic reactions. Their method was able to identify the minimal number of genes that optimally classify the patients of two independent medulloblastoma cohorts with multi-omics datasets. The method achieved high accuracy in patient stratification and a strong dimensionality reduction, identifying the genes that are sufficient to define the medulloblastoma subgroups and to suggest new ones.
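
To make the idea of a multilayer network more concrete, the following is a minimal toy sketch in Python. It is not the authors' published method, whose optimization procedure is not detailed in this release. It only illustrates how gene associations from several kinds of evidence can be stored as separate layers and compared. The layer names, the gene labels (G1 to G5), and the helper functions `edge_support` and `consistent_pairs` are all hypothetical and chosen for illustration.

```python
from collections import Counter

# Hypothetical toy data (not from the study): each layer holds one kind of
# biomedical evidence as a set of undirected gene-gene edges.
LAYERS = {
    "protein_interactions": {("G1", "G2"), ("G2", "G3"), ("G4", "G5")},
    "pathways": {("G1", "G2"), ("G1", "G3"), ("G4", "G5")},
    "drug_targets": {("G2", "G1"), ("G3", "G5")},
}


def edge_support(layers):
    """Count in how many layers each undirected gene pair is connected."""
    support = Counter()
    for edges in layers.values():
        # Sort each pair so (a, b) and (b, a) are treated as the same edge,
        # and de-duplicate so a pair counts at most once per layer.
        unique_pairs = {tuple(sorted(edge)) for edge in edges}
        support.update(unique_pairs)
    return support


def consistent_pairs(layers, min_layers):
    """Return the gene pairs linked in at least `min_layers` layers, sorted."""
    counts = edge_support(layers)
    return sorted(pair for pair, n in counts.items() if n >= min_layers)


if __name__ == "__main__":
    # Gene pairs supported by at least two independent kinds of evidence.
    print(consistent_pairs(LAYERS, min_layers=2))
```

In this toy setting, gene pairs that stay connected across several layers are the ones with the most consistent supporting evidence. The actual study works with far richer data and a dedicated optimization step, so this sketch should be read only as an aid to intuition.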

 

Article: The multilayer community structure of medulloblastoma

DOI: https://doi.org/10.1016/j.isci.2021.102365