A framework for genomic sequencing on clusters of multicore and manycore processors

RiuNet: Institutional Repository of the Universitat Politècnica de València

dc.contributor.author Martínez, Héctor es_ES
dc.contributor.author Barrachina, Sergio es_ES
dc.contributor.author Castillo, Maribel es_ES
dc.contributor.author Tárraga, Joaquín es_ES
dc.contributor.author Medina, Ignacio es_ES
dc.contributor.author Dopazo, Joaquín es_ES
dc.contributor.author Quintana Ortí, Enrique Salvador es_ES
dc.date.accessioned 2020-07-08T03:32:36Z
dc.date.available 2020-07-08T03:32:36Z
dc.date.issued 2018-05 es_ES
dc.identifier.issn 1094-3420 es_ES
dc.identifier.uri http://hdl.handle.net/10251/147637
dc.description.abstract [EN] The advances in genomic sequencing during the past few years have motivated the development of fast and reliable software for DNA/RNA sequencing on current high performance architectures. Most of these efforts target multicore processors, only a few can also exploit graphics processing units, and a much smaller set will run in clusters equipped with any of these multi-threaded architecture technologies. Furthermore, the examples that can be used on clusters today are all strongly coupled with a particular aligner. In this paper we introduce an alignment framework that can be leveraged to coordinately run any single-node aligner, taking advantage of the resources of a cluster without having to modify any portion of the original software. The key to our transparent migration lies in hiding the complexity associated with the multi-node execution (such as coordinating the processes running in the cluster nodes) inside the generic-aligner framework. Moreover, following the design and operation in our Message Passing Interface (MPI) version of HPG Aligner RNA BWT, we organize the framework into two stages in order to be able to execute different aligners in each one of them. With this configuration, for example, the first stage can ideally apply a fast aligner to accelerate the process, while the second one can be tuned to act as a refinement stage that further improves the global alignment process with little cost. es_ES
dc.description.sponsorship The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The researchers from the University Jaume I were supported by the MINECO/CICYT (grant numbers TIN2011-23283 and TIN2014-53495-R) and FEDER. es_ES
dc.language English es_ES
dc.publisher SAGE Publications es_ES
dc.relation.ispartof International Journal of High Performance Computing Applications es_ES
dc.rights All rights reserved es_ES
dc.subject Genomic sequencing es_ES
dc.subject DNA-seq es_ES
dc.subject RNA-seq es_ES
dc.subject High performance computing es_ES
dc.subject Clusters es_ES
dc.subject Multi-threaded architectures es_ES
dc.subject.classification Computer Architecture and Technology es_ES
dc.title A framework for genomic sequencing on clusters of multicore and manycore processors es_ES
dc.type Article es_ES
dc.identifier.doi 10.1177/1094342016653243 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//TIN2014-53495-R/ES/COMPUTACION HETEROGENEA DE BAJO CONSUMO/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2011-23283/ES/POWER-AWARE HIGH PERFORMANCE COMPUTING/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors es_ES
dc.description.bibliographicCitation Martínez, H.; Barrachina, S.; Castillo, M.; Tárraga, J.; Medina, I.; Dopazo, J.; Quintana Ortí, ES. (2018). A framework for genomic sequencing on clusters of multicore and manycore processors. International Journal of High Performance Computing Applications. 32(3):393-406. https://doi.org/10.1177/1094342016653243 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1177/1094342016653243 es_ES
dc.description.upvformatpinicio 393 es_ES
dc.description.upvformatpfin 406 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 32 es_ES
dc.description.issue 3 es_ES
dc.relation.pasarela S\380787 es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder Ministerio de Economía y Competitividad es_ES
dc.contributor.funder Ministerio de Ciencia e Innovación es_ES
dc.description.references Biesecker, L. G. (2010). Exome sequencing makes medical genomics a reality. Nature Genetics, 42(1), 13-14. doi:10.1038/ng0110-13 es_ES
dc.description.references Burrows, M., & Wheeler, D. (1994). A block sorting lossless data compression algorithm. Technical report 124. Palo Alto: Digital Equipment Corporation. es_ES
dc.description.references Cock, P. J. A., Fields, C. J., Goto, N., Heuer, M. L., & Rice, P. M. (2009). The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research, 38(6), 1767-1771. doi:10.1093/nar/gkp1137 es_ES
dc.description.references Dobin, A., Davis, C. A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., … Gingeras, T. R. (2012). STAR: ultrafast universal RNA-seq aligner. Bioinformatics, 29(1), 15-21. doi:10.1093/bioinformatics/bts635 es_ES
dc.description.references Ferragina, P., & Manzini, G. (n.d.). Opportunistic data structures with applications. Proceedings 41st Annual Symposium on Foundations of Computer Science. doi:10.1109/sfcs.2000.892127 es_ES
dc.description.references Garber, M., Grabherr, M. G., Guttman, M., & Trapnell, C. (2011). Computational methods for transcriptome annotation and quantification using RNA-seq. Nature Methods, 8(6), 469-477. doi:10.1038/nmeth.1613 es_ES
dc.description.references Grant, G. R., Farkas, M. H., Pizarro, A. D., Lahens, N. F., Schug, J., Brunk, B. P., … Pierce, E. A. (2011). Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM). Bioinformatics, 27(18), 2518-2528. doi:10.1093/bioinformatics/btr427 es_ES
dc.description.references Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., & Salzberg, S. L. (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biology, 14(4), R36. doi:10.1186/gb-2013-14-4-r36 es_ES
dc.description.references Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357-359. doi:10.1038/nmeth.1923 es_ES
dc.description.references Langmead, B., Trapnell, C., Pop, M., & Salzberg, S. L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology, 10(3), R25. doi:10.1186/gb-2009-10-3-r25 es_ES
dc.description.references Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., … Homer, N. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078-2079. doi:10.1093/bioinformatics/btp352 es_ES
dc.description.references Li, H., & Homer, N. (2010). A survey of sequence alignment algorithms for next-generation sequencing. Briefings in Bioinformatics, 11(5), 473-483. doi:10.1093/bib/bbq015 es_ES
dc.description.references Liu, Y., & Schmidt, B. (2014). CUSHAW2-GPU: Empowering Faster Gapped Short-Read Alignment Using GPU Computing. IEEE Design & Test, 31(1), 31-39. doi:10.1109/mdat.2013.2284198 es_ES
dc.description.references Liu, Y., Popp, B., & Schmidt, B. (2014). CUSHAW3: Sensitive and Accurate Base-Space and Color-Space Short-Read Alignment with Hybrid Seeding. PLoS ONE, 9(1), e86869. doi:10.1371/journal.pone.0086869 es_ES
dc.description.references Manber, U., & Myers, G. (1993). Suffix Arrays: A New Method for On-Line String Searches. SIAM Journal on Computing, 22(5), 935-948. doi:10.1137/0222058 es_ES
dc.description.references Martinez, H., Barrachina, S., Castillo, M., Tarraga, J., Medina, I., Dopazo, J., & Quintana-Orti, E. S. (2015). Scalable RNA Sequencing on Clusters of Multicore Processors. 2015 IEEE Trustcom/BigDataSE/ISPA. doi:10.1109/trustcom.2015.631 es_ES
dc.description.references Martínez, H., Tárraga, J., Medina, I., Barrachina, S., Castillo, M., Dopazo, J., & Quintana-Ortí, E. S. (2013). A dynamic pipeline for RNA sequencing on multicore processors. Proceedings of the 20th European MPI Users’ Group Meeting (EuroMPI ’13). doi:10.1145/2488551.2488581 es_ES
dc.description.references Martinez, H., Tarraga, J., Medina, I., Barrachina, S., Castillo, M., Dopazo, J., & Quintana-Orti, E. S. (2015). Concurrent and Accurate Short Read Mapping on Multicore Processors. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 12(5), 995-1007. doi:10.1109/tcbb.2015.2392077 es_ES
dc.description.references Smith, T. F., & Waterman, M. S. (1981). Identification of common molecular subsequences. Journal of Molecular Biology, 147(1), 195-197. doi:10.1016/0022-2836(81)90087-5 es_ES
dc.description.references Tárraga, J., Arnau, V., Martínez, H., Moreno, R., Cazorla, D., Salavert-Torres, J., … Medina, I. (2014). Acceleration of short and long DNA read mapping without loss of accuracy using suffix array. Bioinformatics, 30(23), 3396-3398. doi:10.1093/bioinformatics/btu553 es_ES
dc.description.references Wang, K., Singh, D., Zeng, Z., Coleman, S. J., Huang, Y., Savich, G. L., … Liu, J. (2010). MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Research, 38(18), e178-e178. doi:10.1093/nar/gkq622 es_ES
