dc.contributor.author | Carratalá-Sáez, Rocío | es_ES |
dc.contributor.author | Christophersen, Sven | es_ES |
dc.contributor.author | Aliaga, José I. | es_ES |
dc.contributor.author | Beltrán, Vicenç | es_ES |
dc.contributor.author | Börm, Steffen | es_ES |
dc.contributor.author | Quintana Ortí, Enrique Salvador | es_ES |
dc.date.accessioned | 2021-02-06T04:33:59Z | |
dc.date.available | 2021-02-06T04:33:59Z | |
dc.date.issued | 2019-04 | es_ES |
dc.identifier.issn | 1877-7503 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/160841 | |
dc.description.abstract | [EN] We address the parallelization of the LU factorization of hierarchical matrices (H-matrices) arising from boundary element methods. Our approach exploits task-parallelism via the OmpSs programming model and runtime, which discovers the data-flow parallelism intrinsic to the operation at execution time, via the analysis of data dependencies based on the memory addresses of the tasks' operands. This is especially challenging for H-matrices, as the structures containing the data vary in dimension during the execution. We tackle this issue by decoupling the data structure from that used to detect dependencies. Furthermore, we leverage the support for weak operands and early release of dependencies, recently introduced in OmpSs-2, to accelerate the execution of parallel codes with nested task-parallelism and fine-grain tasks. As a result, we obtain a significant improvement in the parallel performance with respect to our previous work. | es_ES |
dc.description.sponsorship | The researchers from Universidad Jaume I (UJI) were supported by projects CICYT TIN2014-53495-R and TIN2017-82972-R of MINECO and FEDER; project UJI-B2017-46 of UJI; and the FPU program of MECD. | es_ES |
dc.language | English | es_ES |
dc.publisher | Elsevier | es_ES |
dc.relation.ispartof | Journal of Computational Science | es_ES |
dc.rights | Attribution - NonCommercial - NoDerivatives (by-nc-nd) | es_ES |
dc.subject | Hierarchical linear algebra | es_ES |
dc.subject | LU factorization | es_ES |
dc.subject | Nested task-parallelism | es_ES |
dc.subject | Task dependencies | es_ES |
dc.subject | Multi-threading | es_ES |
dc.subject | Multicore processors | es_ES |
dc.subject | Boundary element methods (BEM) | es_ES |
dc.subject.classification | COMPUTER ARCHITECTURE AND TECHNOLOGY | es_ES |
dc.title | Exploiting nested task-parallelism in the H-LU factorization | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1016/j.jocs.2019.02.004 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//TIN2014-53495-R/ES/COMPUTACION HETEROGENEA DE BAJO CONSUMO/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-82972-R/ES/TECNICAS ALGORITMICAS PARA COMPUTACION DE ALTO RENDIMIENTO CONSCIENTE DEL CONSUMO ENERGETICO Y RESISTENTE A ERRORES/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/UJI//UJI-B2017-46/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors | es_ES |
dc.description.bibliographicCitation | Carratalá-Sáez, R.; Christophersen, S.; Aliaga, JI.; Beltrán, V.; Börm, S.; Quintana Ortí, ES. (2019). Exploiting nested task-parallelism in the H-LU factorization. Journal of Computational Science. 33:20-33. https://doi.org/10.1016/j.jocs.2019.02.004 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1016/j.jocs.2019.02.004 | es_ES |
dc.description.upvformatpinicio | 20 | es_ES |
dc.description.upvformatpfin | 33 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 33 | es_ES |
dc.relation.pasarela | S\390054 | es_ES |
dc.contributor.funder | Universitat Jaume I | es_ES |
dc.contributor.funder | European Regional Development Fund | es_ES |
dc.contributor.funder | Ministerio de Economía y Competitividad | es_ES |
dc.contributor.funder | Ministerio de Educación, Cultura y Deporte | es_ES |
dc.contributor.funder | Comisión Interministerial de Ciencia y Tecnología | es_ES |
dc.contributor.funder | Agencia Estatal de Investigación | es_ES |
dc.description.references | Hackbusch, W. (1999). A Sparse Matrix Arithmetic Based on $\mathcal{H}$-Matrices. Part I: Introduction to $\mathcal{H}$-Matrices. Computing, 62(2), 89-108. doi:10.1007/s006070050015 | es_ES |
dc.description.references | Grasedyck, L., & Hackbusch, W. (2003). Construction and Arithmetics of $\mathcal{H}$-Matrices. Computing, 70(4), 295-334. doi:10.1007/s00607-003-0019-1 | es_ES |
dc.description.references | Dongarra, J. J., Du Croz, J., Hammarling, S., & Duff, I. S. (1990). A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, 16(1), 1-17. doi:10.1145/77626.79170 | es_ES |
dc.description.references | Buttari, A., Langou, J., Kurzak, J., & Dongarra, J. (2009). A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Computing, 35(1), 38-53. doi:10.1016/j.parco.2008.10.002 | es_ES |
dc.description.references | Quintana-Ortí, G., Quintana-Ortí, E. S., van de Geijn, R. A., Van Zee, F. G., & Chan, E. (2009). Programming matrix algorithms-by-blocks for thread-level parallelism. ACM Transactions on Mathematical Software, 36(3), 1-26. doi:10.1145/1527286.1527288 | es_ES |
dc.description.references | Badia, R. M., Herrero, J. R., Labarta, J., Pérez, J. M., Quintana-Ortí, E. S., & Quintana-Ortí, G. (2009). Parallelizing dense and banded linear algebra libraries using SMPSs. Concurrency and Computation: Practice and Experience, 21(18), 2438-2456. doi:10.1002/cpe.1463 | es_ES |
dc.description.references | Aliaga, J. I., Badia, R. M., Barreda, M., Bollhofer, M., & Quintana-Orti, E. S. (2014). Leveraging Task-Parallelism with OmpSs in ILUPACK’s Preconditioned CG Method. 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing. doi:10.1109/sbac-pad.2014.24 | es_ES |
dc.description.references | Agullo, E., Buttari, A., Guermouche, A., & Lopez, F. (2016). Implementing Multifrontal Sparse Solvers for Multicore Architectures with Sequential Task Flow Runtime Systems. ACM Transactions on Mathematical Software, 43(2), 1-22. doi:10.1145/2898348 | es_ES |
dc.description.references | Aliaga, J. I., Carratala-Saez, R., Kriemann, R., & Quintana-Orti, E. S. (2017). Task-Parallel LU Factorization of Hierarchical Matrices Using OmpSs. 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). doi:10.1109/ipdpsw.2017.124 | es_ES |
dc.description.references | The OpenMP API specification for parallel programming, http://www.openmp.org/. | es_ES |
dc.description.references | OmpSs project home page, http://pm.bsc.es/ompss. | es_ES |
dc.description.references | Perez, J. M., Beltran, V., Labarta, J., & Ayguade, E. (2017). Improving the Integration of Task Nesting and Dependencies in OpenMP. 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS). doi:10.1109/ipdps.2017.69 | es_ES |
dc.description.references | HLIBpro library home page, https://www.hlibpro.com/. | es_ES |
dc.description.references | Bempp library home page, https://bempp.com/. | es_ES |
dc.description.references | HACApK library GitHub repository, https://github.com/hoshino-UTokyo/hacapk-gpu. | es_ES |
dc.description.references | hmglib library GitHub repository, https://github.com/zaspel/hmglib. | es_ES |
dc.description.references | HiCMA library GitHub repository, https://github.com/ecrc/hicma. | es_ES |
dc.description.references | Hackbusch, W., & Börm, S. (2002). $\mathcal{H}^2$-matrix approximation of integral operators by interpolation. Applied Numerical Mathematics, 43(1-2), 129-143. doi:10.1016/s0168-9274(02)00121-6 | es_ES |