Mostrar el registro sencillo del ítem
dc.contributor.author | Salavert Torres, José | es_ES |
dc.contributor.author | Tomás Domínguez, Andrés Enrique | es_ES |
dc.contributor.author | Tárraga Giménez, Joaquín | es_ES |
dc.contributor.author | Medina Castelló, Ignacio | es_ES |
dc.contributor.author | Dopazo Blazquez, Joaquin | es_ES |
dc.contributor.author | Blanquer Espert, Ignacio | es_ES |
dc.date.accessioned | 2015-06-26T18:11:39Z | |
dc.date.available | 2015-06-26T18:11:39Z | |
dc.date.issued | 2015-01-28 | |
dc.identifier.issn | 1471-2105 | |
dc.identifier.uri | http://hdl.handle.net/10251/52360 | |
dc.description.abstract | Background: Short sequence mapping methods for Next Generation Sequencing consist on a combination of seeding techniques followed by local alignment based on dynamic programming approaches. Most seeding algorithms are based on backward search alignment, using the Burrows Wheeler Transform, the Ferragina and Manzini Index or Suffix Arrays. All these backward search algorithms have excellent performance, but their computational cost highly increases when allowing errors. In this paper, we discuss an inexact mapping algorithm based on pruning strategies for search tree exploration over genomic data. Results: The proposed algorithm achieves a 13x speed-up over similar algorithms when allowing 6 base errors, including insertions, deletions and mismatches. This algorithm can deal with 400 bps reads with up to 9 errors in a high quality Illumina dataset. In this example, the algorithm works as a preprocessor that reduces by 55% the number of reads to be aligned. Depending on the aligner the overall execution time is reduced between 20–40%. Conclusions: Although not intended as a complete sequence mapping tool, the proposed algorithm could be used as a preprocessing step to modern sequence mappers. This step significantly reduces the number reads to be aligned, accelerating overall alignment time. Furthermore, this algorithm could be used for accelerating the seeding step of already available sequence mappers. In addition, an out-of-core index has been implemented for working with large genomes on systems without expensive memory configurations. | es_ES |
dc.description.sponsorship | The authors would like to thank the Universitat Politecnica de Valencia (Spain) in the frame of the grant "High-performance tools for the alignment of genetic sequences using graphic accelerators (GPGPUs)/Herramientas de altas prestaciones para el alineamiento de secuencias geneticas mediante el uso de aceleradores graficos (GPGPUs)", research program PAID-06-11, code 2025. | en_EN |
dc.language | Inglés | es_ES |
dc.publisher | BioMed Central | es_ES |
dc.relation.ispartof | BMC Bioinformatics | es_ES |
dc.rights | Reconocimiento (by) | es_ES |
dc.subject | Sequence mapping | es_ES |
dc.subject | Burrows-wheeler | es_ES |
dc.subject | FM-Index | es_ES |
dc.subject | Suffix array | es_ES |
dc.subject.classification | CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL | es_ES |
dc.subject.classification | LENGUAJES Y SISTEMAS INFORMATICOS | es_ES |
dc.title | Fast inexact mapping using advanced tree exploration on backward search methods | es_ES |
dc.type | Artículo | es_ES |
dc.identifier.doi | 10.1186/s12859-014-0438-3 | |
dc.relation.projectID | info:eu-repo/grantAgreement/UPV//PAID-06-11-2025/ | es_ES |
dc.rights.accessRights | Abierto | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Instituto de Instrumentación para Imagen Molecular - Institut d'Instrumentació per a Imatge Molecular | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació | es_ES |
dc.description.bibliographicCitation | Salavert Torres, J.; Tomás Domínguez, AE.; Tárraga Giménez, J.; Medina Castelló, I.; Dopazo Blazquez, J.; Blanquer Espert, I. (2015). Fast inexact mapping using advanced tree exploration on backward search methods. BMC Bioinformatics. 16(18):1-11. https://doi.org/10.1186/s12859-014-0438-3 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | http://dx.doi.org/10.1186/s12859-014-0438-3 | es_ES |
dc.description.upvformatpinicio | 1 | es_ES |
dc.description.upvformatpfin | 11 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 16 | es_ES |
dc.description.issue | 18 | es_ES |
dc.relation.senia | 285402 | |
dc.identifier.pmid | 25626517 | en_EN |
dc.identifier.pmcid | PMC4384339 | en_EN |
dc.contributor.funder | Universitat Politècnica de València | es_ES |
dc.description.references | Li H, Homer N. A survey of sequence alignment algorithms for next-generation sequencing. Brief Biol. 2010; 11(5):473–83. | es_ES |
dc.description.references | Smith TF, Waterman MS. Identification of common molecular subsequences. J Mol Biol. 1981; 147:195–7. | es_ES |
dc.description.references | Gotoh O. An improved algorithm for matching biological sequences. J Mol Biol. 1982; 162:705–8. | es_ES |
dc.description.references | Durbin R, Eddy SR, Krogh A, Mitchison G. Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge University Press: Cambridge; 1998. [ http://books.google.es/books?id=R5P2GlJvigQC ] | es_ES |
dc.description.references | Ferragina P, Manzini G. Indexing compressed text. J ACM. 2005; 52(4):552–81. doi:10.1145/10820361082039 | es_ES |
dc.description.references | Burrows M, Wheeler DJ. A block-sorting lossless data compression algorithm. Technical Report 124. (SRC Digital, DEC Palo Alto); May 1994 | es_ES |
dc.description.references | Manzini G. An analysis of the burrows-wheeler transform. In: Proceedings of the Tenth Annual ACM-SIAM Symposium on Discrete Algorithms. NY: ACM-SIAM: 1999. p. 669–77. | es_ES |
dc.description.references | Ferragina P, Manzini G. Opportunistic data structures with applications. In: FOCS: 2000. p. 390–398. | es_ES |
dc.description.references | Li H, Durbin R. Fast and accurate short read alignment with burrows-wheeler transform. Bioinformatics. 2009; 25(14):1754–1760. | es_ES |
dc.description.references | Li R, Yu C, Li Y, Lam T-W, Yiu S-M, Kristiansen K, et al. Soap2: an improved ultrafast tool for short read alignment. Bioinformatics. 2009; 25(15):1966–1967. doi:10.1093/bioinformatics/btp336. | es_ES |
dc.description.references | Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009; 10:(R25). | es_ES |
dc.description.references | Luo R, Wong T, Zhu J, Liu C-M, Zhu X, Wu E, et al. Soap3-dp: Fast, accurate and sensitive gpu-based short read aligner. PLoS ONE. 2013; 8(5):65632. doi:10.1371/journal.pone.0065632 | es_ES |
dc.description.references | Liu Y, Schmidt B. Long read alignment based on maximal exact match seeds. Bioinformatics. 2012; 28(18):318–324. doi:10.1093/bioinformatics/bts414 | es_ES |
dc.description.references | Klus P, Lam S, Lyberg D, Cheung M, Pullan G, McFarlane I, et al. Barracuda - a fast short read sequence aligner using graphics processing units. BMC Res Notes. 2012; 5(1):27. doi:10.1186/1756-0500-5-27 | es_ES |
dc.description.references | Salavert J, Blanquer I, Andrés T, Vicente H, Ignacio M, Joaquín T, et al. Using gpus for the exact alignment of short-read genetic sequences by means of the burrows-wheeler transform. IEEE/ACM Trans Comput Biol Bioinf. 2012; 9(4):1245–56. doi:10.1109/TCBB.2012.49 | es_ES |
dc.description.references | Xin Y, Liu B, Min B, Li WXY, Cheung RCC, Fong AS, et al. Parallel architecture for {DNA} sequence inexact matching with burrows-wheeler transform. Microelectron J. 2013; 44(8):670–82. doi:10.1016/j.mejo.2013.05.004 | es_ES |
dc.description.references | Manber U, Myers G. Suffix arrays: A new method for on-line string searches. In: Proceedings of the First Annual ACM-SIAM Symposium on Discrete Algorithms. SODA ’90Philadelphia, PA, USA: Society for Industrial and Applied Mathematics: 1990. p. 319–327. http://dl.acm.org/citation.cfm?id=320176.320218 | es_ES |
dc.description.references | Abouelhoda MI, Kurtz S, Ohlebusch E. The enhanced suffix array and its applications to genome analysis. In: Proc. Workshop on Algorithms in Bioinformatics, in Lecture Notes in Computer Science,Heidelberger, Berlin: Springer: 2002. p. 449–63. | es_ES |
dc.description.references | Vyverman M, De Baets B, Fack V, Dawyndt P. essamem: finding maximal exact matches using enhanced sparse suffix arrays. Bioinformatics. 2013; 29(6):802–4. doi:10.1093/bioinformatics/btt042 | es_ES |
dc.description.references | Oguzhan Kulekci M, Hon W-K, Shah R, Scott Vitter J, Xu B. Psi-ra: a parallel sparse index for genomic read alignment. BMC Genomics. 2011; 12(Suppl 2):7. doi:10.1186/1471-2164-12-S2-S7 | es_ES |
dc.description.references | Sadakane K. New text indexing functionalities of the compressed suffix arrays. J Algorithms. 2003; 48(2):294–313. doi:10.1016/S0196-6774(03)00087-7 | es_ES |
dc.description.references | Liu C-M, Wong T, Wu E, Luo R, Yiu S-M, Li Y, et al. Soap3: ultra-fast gpu-based parallel alignment tool for short reads. Bioinformatics. 2012; 28(6):878–9. doi:10.1093/bioinformatics/bts061. http://bioinformatics.oxfordjournals.org/content/28/6/878.full.pdf+html | es_ES |
dc.description.references | Lam TW, Li R, Tam A, Wong S, Wu E, Yiu SM. High throughput short read alignment via bi-directional bwt. In: IEEE International Conference On Bioinformatics and Biomedicine, 2009. BIBM ’09.,Washington, D.C., USA: IEEE Computer Society Press: 2009. p. 31–6. doi:10.1109/BIBM.2009.42 | es_ES |
dc.description.references | Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics. 2010; 26(5):589–95. doi:10.1093/bioinformatics/btp698 | es_ES |
dc.description.references | Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Meth. 2012; 9(4):357–9. doi:10.1038/nmeth.1923 | es_ES |
dc.description.references | Mu JC, Jiang H, Kiani A, Mohiyuddin M, Asadi NB, Wong WH. Fast and accurate read alignment for resequencing. Bioinformatics. 2012; 28(18):2366–73. doi:10.1093/bioinformatics/bts450 | es_ES |
dc.description.references | Ning Z, Cox AJ, Mullikin JC. Ssaha: A fast search method for large dna databases. Genome Res. 2001; 11(10):1725–9. doi:10.1101/gr.194201 | es_ES |
dc.description.references | Marco-Sola S, Sammeth M, Guigo R, Ribeca P. The GEM mapper: fast, accurate and versatile alignment by filtration. Nat Meth. 2012; 9(12):1185–8. doi:10.1038/nmeth.2221 | es_ES |
dc.description.references | Sadakane K. A library for compressed full-text indexes. https://code.google.com/p/csalib/ (2010) | es_ES |
dc.description.references | Mäkinen V, Navarro G, Sadakane K. Advantages of backward searching; efficient secondary memory and distributed implementation of compressed suffix arrays. In: Proceedings of the 15th International Conference on Algorithms and Computation. ISAAC’04,Berlin, Heidelberg: Springer: 2004. p. 681–92. doi:10.1007/978-3-540-30551-4_59. http://dx.doi.org/10.1007/978-3-540-30551-4_59 | es_ES |
dc.description.references | Puglisi SJ, Smyth WF, Turpin AH. A taxonomy of suffix array construction algorithms. ACM Comput Surv. 2007; 39(2). doi:10.1145/1242471.1242472 | es_ES |
dc.description.references | Okanohara D, Sadakane K. A linear-time burrows-wheeler transform using induced sorting. In: Karlgren J, Tarhio J, Hyyrö H, editors. String Processing and Information Retrieval. Lecture Notes in Computer Science, vol. 5721. Heidelberg, Berlin: Springer: 2009. p. 90–101. | es_ES |
dc.description.references | Grossi R, Vitter J. Compressed suffix arrays and suffix trees with applications to text indexing and string matching. SICOMP: SIAM J Comput. 2005; 35(2):378–407. | es_ES |