Exploiting nested task-parallelism in the H-LU factorization

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia


dc.contributor.author Carratalá-Sáez, Rocío es_ES
dc.contributor.author Christophersen, Sven es_ES
dc.contributor.author Aliaga, José I. es_ES
dc.contributor.author Beltrán, Vicenç es_ES
dc.contributor.author Börm, Steffen es_ES
dc.contributor.author Quintana Ortí, Enrique Salvador es_ES
dc.date.accessioned 2021-02-06T04:33:59Z
dc.date.available 2021-02-06T04:33:59Z
dc.date.issued 2019-04 es_ES
dc.identifier.issn 1877-7503 es_ES
dc.identifier.uri http://hdl.handle.net/10251/160841
dc.description.abstract [EN] We address the parallelization of the LU factorization of hierarchical matrices (H-matrices) arising from boundary element methods. Our approach exploits task-parallelism via the OmpSs programming model and runtime, which discovers the data-flow parallelism intrinsic to the operation at execution time, via the analysis of data dependencies based on the memory addresses of the tasks' operands. This is especially challenging for H-matrices, as the structures containing the data vary in dimension during the execution. We tackle this issue by decoupling the data structure from that used to detect dependencies. Furthermore, we leverage the support for weak operands and early release of dependencies, recently introduced in OmpSs-2, to accelerate the execution of parallel codes with nested task-parallelism and fine-grain tasks. As a result, we obtain a significant improvement in the parallel performance with respect to our previous work. es_ES
dc.description.sponsorship The researchers from Universidad Jaume I (UJI) were supported by projects CICYT TIN2014-53495-R and TIN2017-82972-R of MINECO and FEDER; project UJI-B2017-46 of UJI; and the FPU program of MECD. es_ES
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Journal of Computational Science es_ES
dc.rights Attribution - NonCommercial - NoDerivatives (by-nc-nd) es_ES
dc.subject Hierarchical linear algebra es_ES
dc.subject LU factorization es_ES
dc.subject Nested task-parallelism es_ES
dc.subject Task dependencies es_ES
dc.subject Multi-threading es_ES
dc.subject Multicore processors es_ES
dc.subject Boundary element methods (BEM) es_ES
dc.subject.classification COMPUTER ARCHITECTURE AND TECHNOLOGY es_ES
dc.title Exploiting nested task-parallelism in the H-LU factorization es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.jocs.2019.02.004 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//TIN2014-53495-R/ES/COMPUTACION HETEROGENEA DE BAJO CONSUMO/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-82972-R/ES/TECNICAS ALGORITMICAS PARA COMPUTACION DE ALTO RENDIMIENTO CONSCIENTE DEL CONSUMO ENERGETICO Y RESISTENTE A ERRORES/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UJI//UJI-B2017-46/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors es_ES
dc.description.bibliographicCitation Carratalá-Sáez, R.; Christophersen, S.; Aliaga, JI.; Beltrán, V.; Börm, S.; Quintana Ortí, ES. (2019). Exploiting nested task-parallelism in the H-LU factorization. Journal of Computational Science. 33:20-33. https://doi.org/10.1016/j.jocs.2019.02.004 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1016/j.jocs.2019.02.004 es_ES
dc.description.upvformatpinicio 20 es_ES
dc.description.upvformatpfin 33 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 33 es_ES
dc.relation.pasarela S\390054 es_ES
dc.contributor.funder Universitat Jaume I es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder Ministerio de Economía y Competitividad es_ES
dc.contributor.funder Ministerio de Educación, Cultura y Deporte es_ES
dc.contributor.funder Comisión Interministerial de Ciencia y Tecnología es_ES
dc.contributor.funder Agencia Estatal de Investigación es_ES
dc.description.references Hackbusch, W. (1999). A Sparse Matrix Arithmetic Based on $\mathcal{H}$-Matrices. Part I: Introduction to $\mathcal{H}$-Matrices. Computing, 62(2), 89-108. doi:10.1007/s006070050015 es_ES
dc.description.references Grasedyck, L., & Hackbusch, W. (2003). Construction and Arithmetics of $\mathcal{H}$-Matrices. Computing, 70(4), 295-334. doi:10.1007/s00607-003-0019-1 es_ES
dc.description.references Dongarra, J. J., Du Croz, J., Hammarling, S., & Duff, I. S. (1990). A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, 16(1), 1-17. doi:10.1145/77626.79170 es_ES
dc.description.references Buttari, A., Langou, J., Kurzak, J., & Dongarra, J. (2009). A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Computing, 35(1), 38-53. doi:10.1016/j.parco.2008.10.002 es_ES
dc.description.references Quintana-Ortí, G., Quintana-Ortí, E. S., van de Geijn, R. A., Van Zee, F. G., & Chan, E. (2009). Programming matrix algorithms-by-blocks for thread-level parallelism. ACM Transactions on Mathematical Software, 36(3), 1-26. doi:10.1145/1527286.1527288 es_ES
dc.description.references Badia, R. M., Herrero, J. R., Labarta, J., Pérez, J. M., Quintana-Ortí, E. S., & Quintana-Ortí, G. (2009). Parallelizing dense and banded linear algebra libraries using SMPSs. Concurrency and Computation: Practice and Experience, 21(18), 2438-2456. doi:10.1002/cpe.1463 es_ES
dc.description.references Aliaga, J. I., Badia, R. M., Barreda, M., Bollhofer, M., & Quintana-Orti, E. S. (2014). Leveraging Task-Parallelism with OmpSs in ILUPACK’s Preconditioned CG Method. 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing. doi:10.1109/sbac-pad.2014.24 es_ES
dc.description.references Agullo, E., Buttari, A., Guermouche, A., & Lopez, F. (2016). Implementing Multifrontal Sparse Solvers for Multicore Architectures with Sequential Task Flow Runtime Systems. ACM Transactions on Mathematical Software, 43(2), 1-22. doi:10.1145/2898348 es_ES
dc.description.references Aliaga, J. I., Carratala-Saez, R., Kriemann, R., & Quintana-Orti, E. S. (2017). Task-Parallel LU Factorization of Hierarchical Matrices Using OmpSs. 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). doi:10.1109/ipdpsw.2017.124 es_ES
dc.description.references The OpenMP API specification for parallel programming, http://www.openmp.org/. es_ES
dc.description.references OmpSs project home page, http://pm.bsc.es/ompss. es_ES
dc.description.references Perez, J. M., Beltran, V., Labarta, J., & Ayguade, E. (2017). Improving the Integration of Task Nesting and Dependencies in OpenMP. 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS). doi:10.1109/ipdps.2017.69 es_ES
dc.description.references HLIBpro library home page, https://www.hlibpro.com/. es_ES
dc.description.references Bempp library home page, https://bempp.com/. es_ES
dc.description.references HACApK library github repository, https://github.com/hoshino-UTokyo/hacapk-gpu. es_ES
dc.description.references hmglib library github repository, https://github.com/zaspel/hmglib. es_ES
dc.description.references HiCMA library github repository, https://github.com/ecrc/hicma. es_ES
dc.description.references Hackbusch, W., & Börm, S. (2002). $\mathcal{H}^2$-matrix approximation of integral operators by interpolation. Applied Numerical Mathematics, 43(1-2), 129-143. doi:10.1016/s0168-9274(02)00121-6 es_ES
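
The abstract describes a runtime that discovers data-flow parallelism at execution time by analysing the operands that each task reads and writes. A minimal sketch of that idea follows. It is not the authors' code and does not use the OmpSs API. It assumes a dense tiled LU without pivoting instead of an H-LU, uses block coordinates `(i, j)` as stand-ins for operand memory addresses, and the names `lu_tasks`, `build_task_graph` and `levels` are hypothetical.

```python
from collections import defaultdict


def lu_tasks(n):
    """List the tasks of a tiled LU (no pivoting) on an n x n grid of blocks.

    Each task is (name, reads, writes); an in/out operand is listed under writes.
    """
    tasks = []
    for k in range(n):
        tasks.append((f"GETRF({k},{k})", [], [(k, k)]))
        for j in range(k + 1, n):
            tasks.append((f"TRSM_ROW({k},{j})", [(k, k)], [(k, j)]))
        for i in range(k + 1, n):
            tasks.append((f"TRSM_COL({i},{k})", [(k, k)], [(i, k)]))
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                tasks.append((f"GEMM({i},{j})", [(i, k), (k, j)], [(i, j)]))
    return tasks


def build_task_graph(tasks):
    """Derive predecessors of each task from its operands, in submission order."""
    last_writer = {}              # operand -> index of the last task that wrote it
    readers = defaultdict(list)   # operand -> tasks that read it since that write
    deps = {}
    for i, (_name, reads, writes) in enumerate(tasks):
        pred = set()
        for op in reads:
            if op in last_writer:          # read-after-write
                pred.add(last_writer[op])
        for op in writes:
            if op in last_writer:          # write-after-write (covers in/out)
                pred.add(last_writer[op])
            pred.update(readers[op])       # write-after-read
        deps[i] = pred
        for op in reads:
            readers[op].append(i)
        for op in writes:
            last_writer[op] = i
            readers[op] = []
    return deps


def levels(deps):
    """Earliest wave in which each task can run; tasks in one wave are independent."""
    level = {}
    for i in sorted(deps):        # predecessors always have smaller indices
        level[i] = 1 + max((level[p] for p in deps[i]), default=-1)
    return level


tasks = lu_tasks(2)
deps = build_task_graph(tasks)
for i, lvl in levels(deps).items():
    print(lvl, tasks[i][0])
```

Tasks that land in the same wave share no operand conflicts and could run concurrently. The sketch keys dependencies on fixed block coordinates, which is the simple case; the abstract notes that H-matrix structures change dimension during execution, which is why the authors decouple the data structure from the one used to detect dependencies.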

