A Computer-Implemented and Reference-Free Method for Identifying Variants in Nucleic Acid Sequences

 Status:
Search and Examination
 Priority date:
 Inventor:
David Carrera Perez, Jorda Polo, Nicola Cadenelli, David Torrents Arenales, Merce Planas Felix
 Applicant:
Barcelona Supercomputing Center - Centro Nacional De Supercomputacion (BSC-CNS), Institució Catalana de Recerca I Estudis Avançats (ICREA), Universitat Politècnica de Catalunya (UPC)

Abstract

There is provided a computer-implemented method for identifying nucleic acid variants between two cells, such as a normal cell and a pathological cell of a patient, or a cell at two different stages of development. The method is alignment-free, in that it does not depend on a reference genome, and is based on generating and comparing polymorphic k-mers derived from the nucleotide sequence reads of both biological states. The invention identifies all sorts of genetic variants, ranging from single nucleotide variants (SNVs) to large structural variants, with great sensitivity and specificity. As a major novelty, it also identifies non-human insertions, such as those derived from retroviruses. The method can also be integrated with specific hardware architectures in order to speed up execution to an unprecedented level.
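
The general idea of comparing k-mers between two biological states without a reference genome can be illustrated with a minimal sketch. This is not the claimed method, only a simplified illustration under stated assumptions: the function names `count_kmers` and `state_specific_kmers`, the parameters `k` and `min_count`, and the toy reads are all hypothetical, and steps a practical pipeline would likely need (reverse-complement handling, sequencing-error filtering, assembly of the k-mers into variants) are omitted.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring (k-mer) across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def state_specific_kmers(control_reads, case_reads, k=5, min_count=2):
    """Return k-mers seen at least min_count times in the case reads
    and never in the control reads (hypothetical helper for illustration)."""
    control = count_kmers(control_reads, k)
    case = count_kmers(case_reads, k)
    return {
        kmer
        for kmer, n in case.items()
        if n >= min_count and kmer not in control
    }


# Toy reads: the case reads carry a single G->T substitution.
normal_reads = ["ACGTACGGTCA", "CGTACGGTCAT"]
tumor_reads = ["ACGTACTGTCA", "CGTACTGTCAT"]

candidates = state_specific_kmers(normal_reads, tumor_reads)
print(sorted(candidates))
```

In this toy case a single substitution yields k distinct case-specific k-mers (here five), one for each window that overlaps the changed base. No alignment to a reference is involved at any point; only the two read sets are compared.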