Exploiting nested task-parallelism in the H-LU factorization

Carratalá-Sáez, Rocío; Christophersen, Sven; Aliaga, José I.; Beltrán, Vicenç; Börm, Steffen; Quintana Ortí, Enrique Salvador

doi:10.1016/j.jocs.2019.02.004

Files in this item

Name: Carratalá-Sáez;Ch ...

Size: 1.034Mb

Format: PDF

Description: Author's version.

Name: 1-s2.0-S187775031 ...

Size: 5.825Mb

Format: PDF

Description: Publisher's version

Request a copy from the author

dc.contributor.author	Carratalá-Sáez, Rocío	es_ES
dc.contributor.author	Christophersen, Sven	es_ES
dc.contributor.author	Aliaga, José I.	es_ES
dc.contributor.author	Beltrán, Vicenç	es_ES
dc.contributor.author	Börm, Steffen	es_ES
dc.contributor.author	Quintana Ortí, Enrique Salvador	es_ES
dc.date.accessioned	2021-02-06T04:33:59Z
dc.date.available	2021-02-06T04:33:59Z
dc.date.issued	2019-04	es_ES
dc.identifier.issn	1877-7503	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/160841
dc.description.abstract	[EN] We address the parallelization of the LU factorization of hierarchical matrices (H-matrices) arising from boundary element methods. Our approach exploits task-parallelism via the OmpSs programming model and runtime, which discovers the data-flow parallelism intrinsic to the operation at execution time, via the analysis of data dependencies based on the memory addresses of the tasks' operands. This is especially challenging for H-matrices, as the structures containing the data vary in dimension during the execution. We tackle this issue by decoupling the data structure from that used to detect dependencies. Furthermore, we leverage the support for weak operands and early release of dependencies, recently introduced in OmpSs-2, to accelerate the execution of parallel codes with nested task-parallelism and fine-grain tasks. As a result, we obtain a significant improvement in the parallel performance with respect to our previous work.	es_ES
dc.description.sponsorship	The researchers from Universidad Jaume I (UJI) were supported by projects CICYT TIN2014-53495-R and TIN2017-82972-R of MINECO and FEDER; project UJI-B2017-46 of UJI; and the FPU program of MECD.	es_ES
dc.language	English	es_ES
dc.publisher	Elsevier	es_ES
dc.relation.ispartof	Journal of Computational Science	es_ES
dc.rights	Attribution - NonCommercial - NoDerivatives (by-nc-nd)	es_ES
dc.subject	Hierarchical linear algebra	es_ES
dc.subject	LU factorization	es_ES
dc.subject	Nested task-parallelism	es_ES
dc.subject	Task dependencies	es_ES
dc.subject	Multi-threading	es_ES
dc.subject	Multicore processors	es_ES
dc.subject	Boundary element methods (BEM)	es_ES
dc.subject.classification	ARQUITECTURA Y TECNOLOGIA DE COMPUTADORES	es_ES
dc.title	Exploiting nested task-parallelism in the H-LU factorization	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.1016/j.jocs.2019.02.004	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MINECO//TIN2014-53495-R/ES/COMPUTACION HETEROGENEA DE BAJO CONSUMO/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-82972-R/ES/TECNICAS ALGORITMICAS PARA COMPUTACION DE ALTO RENDIMIENTO CONSCIENTE DEL CONSUMO ENERGETICO Y RESISTENTE A ERRORES/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/UJI//UJI-B2017-46/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors	es_ES
dc.description.bibliographicCitation	Carratalá-Sáez, R.; Christophersen, S.; Aliaga, JI.; Beltrán, V.; Börm, S.; Quintana Ortí, ES. (2019). Exploiting nested task-parallelism in the H-LU factorization. Journal of Computational Science. 33:20-33. https://doi.org/10.1016/j.jocs.2019.02.004	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	https://doi.org/10.1016/j.jocs.2019.02.004	es_ES
dc.description.upvformatpinicio	20	es_ES
dc.description.upvformatpfin	33	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	33	es_ES
dc.relation.pasarela	S\390054	es_ES
dc.contributor.funder	Universitat Jaume I	es_ES
dc.contributor.funder	European Regional Development Fund	es_ES
dc.contributor.funder	Ministerio de Economía y Competitividad	es_ES
dc.contributor.funder	Ministerio de Educación, Cultura y Deporte	es_ES
dc.contributor.funder	Comisión Interministerial de Ciencia y Tecnología	es_ES
dc.contributor.funder	Agencia Estatal de Investigación	es_ES
dc.description.references	Hackbusch, W. (1999). A Sparse Matrix Arithmetic Based on $\mathcal{H}$-Matrices. Part I: Introduction to $\mathcal{H}$-Matrices. Computing, 62(2), 89-108. doi:10.1007/s006070050015	es_ES
dc.description.references	Grasedyck, L., & Hackbusch, W. (2003). Construction and Arithmetics of H -Matrices. Computing, 70(4), 295-334. doi:10.1007/s00607-003-0019-1	es_ES
dc.description.references	Dongarra, J. J., Du Croz, J., Hammarling, S., & Duff, I. S. (1990). A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, 16(1), 1-17. doi:10.1145/77626.79170	es_ES
dc.description.references	Buttari, A., Langou, J., Kurzak, J., & Dongarra, J. (2009). A class of parallel tiled linear algebra algorithms for multicore architectures. Parallel Computing, 35(1), 38-53. doi:10.1016/j.parco.2008.10.002	es_ES
dc.description.references	Quintana-Ortí, G., Quintana-Ortí, E. S., Geijn, R. A. V. D., Zee, F. G. V., & Chan, E. (2009). Programming matrix algorithms-by-blocks for thread-level parallelism. ACM Transactions on Mathematical Software, 36(3), 1-26. doi:10.1145/1527286.1527288	es_ES
dc.description.references	Badia, R. M., Herrero, J. R., Labarta, J., Pérez, J. M., Quintana-Ortí, E. S., & Quintana-Ortí, G. (2009). Parallelizing dense and banded linear algebra libraries using SMPSs. Concurrency and Computation: Practice and Experience, 21(18), 2438-2456. doi:10.1002/cpe.1463	es_ES
dc.description.references	Aliaga, J. I., Badia, R. M., Barreda, M., Bollhofer, M., & Quintana-Orti, E. S. (2014). Leveraging Task-Parallelism with OmpSs in ILUPACK’s Preconditioned CG Method. 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing. doi:10.1109/sbac-pad.2014.24	es_ES
dc.description.references	Agullo, E., Buttari, A., Guermouche, A., & Lopez, F. (2016). Implementing Multifrontal Sparse Solvers for Multicore Architectures with Sequential Task Flow Runtime Systems. ACM Transactions on Mathematical Software, 43(2), 1-22. doi:10.1145/2898348	es_ES
dc.description.references	Aliaga, J. I., Carratala-Saez, R., Kriemann, R., & Quintana-Orti, E. S. (2017). Task-Parallel LU Factorization of Hierarchical Matrices Using OmpSs. 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). doi:10.1109/ipdpsw.2017.124	es_ES
dc.description.references	The OpenMP API specification for parallel programming, http://www.openmp.org/.	es_ES
dc.description.references	OmpSs project home page, http://pm.bsc.es/ompss.	es_ES
dc.description.references	Perez, J. M., Beltran, V., Labarta, J., & Ayguade, E. (2017). Improving the Integration of Task Nesting and Dependencies in OpenMP. 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS). doi:10.1109/ipdps.2017.69	es_ES
dc.description.references	HLIBpro library home page, https://www.hlibpro.com/.	es_ES
dc.description.references	Bempp library home page, https://bempp.com/.	es_ES
dc.description.references	HACApK library github repository, https://github.com/hoshino-UTokyo/hacapk-gpu.	es_ES
dc.description.references	hmglib library github repository, https://github.com/zaspel/hmglib.	es_ES
dc.description.references	HiCMA library github repository, https://github.com/ecrc/hicma.	es_ES
dc.description.references	Hackbusch, W., & Börm, S. (2002). $\mathcal{H}^2$-matrix approximation of integral operators by interpolation. Applied Numerical Mathematics, 43(1-2), 129-143. doi:10.1016/s0168-9274(02)00121-6	es_ES

This item appears in the following collection(s)

Articles, conferences, monographs [48344]

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia
