A computer-implemented and reference-free method for identifying variants in nucleic acid sequences

David Carrera Perez, Jorda Polo, Nicola Cadenelli, David Torrents Arenales, Merce Planas Felix
Barcelona Supercomputing Center - Centro Nacional De Supercomputacion (BSC-CNS), Institució Catalana de Recerca I Estudis Avançats (ICREA), Universitat Politècnica de Catalunya (UPC)


There is provided a computer-implemented method for identifying nucleic acid variants between two cells, such as a normal cell and a pathological cell of a patient, or a cell at two different stages of development. The method is reference-free and alignment-free: it does not depend on a reference genome, and is instead based on generating and comparing polymorphic k-mers derived from the nucleotide sequence reads of both biological states. The invention identifies a wide range of genetic variants, from single nucleotide variants (SNVs) to large structural variants, with high sensitivity and specificity. As a major novelty, it also identifies non-human insertions, such as those derived from retroviruses. In addition, the method can be integrated with specific hardware architectures to substantially speed up execution.
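The idea of comparing k-mers from two biological states without a reference genome can be illustrated with a minimal Python sketch. This is not the patented method: it shows only a plain set difference between k-mer counts, and the function names (`count_kmers`, `state_specific_kmers`), the parameters `k` and `min_count`, and the toy reads are all assumptions introduced for illustration. How the invention defines and filters "polymorphic" k-mers is not stated in the abstract, and reverse complements are ignored here for simplicity.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across a list of reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def state_specific_kmers(normal_reads, tumor_reads, k=5, min_count=2):
    """Return k-mers seen at least min_count times in the tumor reads
    and never in the normal reads (hypothetical filter, for illustration)."""
    normal = count_kmers(normal_reads, k)
    tumor = count_kmers(tumor_reads, k)
    return {
        kmer: count
        for kmer, count in tumor.items()
        if count >= min_count and kmer not in normal
    }


# Toy data (invented): the tumor reads carry a single A>T substitution.
normal_reads = ["ACGTACGTAC", "ACGTACGTAC"]
tumor_reads = ["ACGTTCGTAC", "ACGTTCGTAC"]

specific = state_specific_kmers(normal_reads, tumor_reads, k=5, min_count=2)
for kmer in sorted(specific):
    print(kmer, specific[kmer])
```

In this toy case a single substitution yields k tumor-specific k-mers (here 5), one for each window that overlaps the changed base. The `min_count` threshold is a crude stand-in for sequencing-error filtering; a real pipeline would need a more careful treatment of errors, strand, and coverage.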
The project leading to this patent has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 639595).