Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia


dc.contributor.author Catalán, Sandra es_ES
dc.contributor.author Igual, Francisco D. es_ES
dc.contributor.author Herrero, José R. es_ES
dc.contributor.author Rodríguez-Sánchez, Rafael es_ES
dc.contributor.author Quintana-Ortí, Enrique S. es_ES
dc.date.accessioned 2024-05-17T18:06:21Z
dc.date.available 2024-05-17T18:06:21Z
dc.date.issued 2023-05 es_ES
dc.identifier.issn 0743-7315 es_ES
dc.identifier.uri http://hdl.handle.net/10251/204242
dc.description.abstract [EN] We propose a methodology to address the programmability issues derived from the emergence of new-generation shared-memory NUMA architectures. For this purpose, we employ dense matrix factorizations and matrix inversion (DMFI) as a use case, and we target two modern architectures (AMD Rome and Huawei Kunpeng 920) that exhibit configurable NUMA topologies. Our methodology pursues performance portability across different NUMA configurations by proposing multi-domain implementations for DMFI plus a hybrid task- and loop-level parallelization that configures multi-threaded executions to fix core-to-data binding, exploiting locality at the expense of minor code modifications. In addition, we introduce a generalization of the multi-domain implementations for DMFI that offers support for virtually any NUMA topology in present and future architectures. Our experimentation on the two target architectures for three representative dense linear algebra operations validates the proposal, reveals insights on the necessity of adapting both the codes and their execution to improve data access locality, and reports performance across architectures and inter- and intra-socket NUMA configurations competitive with state-of-the-art message-passing implementations, maintaining the ease of development usually associated with shared-memory programming. es_ES
dc.description.sponsorship This research was sponsored by project PID2019-107255GB of Ministerio de Ciencia, Innovación y Universidades; project S2018/TCS-4423 of Comunidad de Madrid; project 2017-SGR-1414 of the Generalitat de Catalunya and the Madrid Government under the Multiannual Agreement with UCM in the line Program to Stimulate Research for Young Doctors in the context of the V PRICIT, project PR65/19-22445. This project has also received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 955558. The JU receives support from the European Union's Horizon 2020 research and innovation programme, and Spain, Germany, France, Italy, Poland, Switzerland, Norway. The work is also supported by grants PID2020-113656RB-C22 and PID2021-126576NB-I00 of MCIN/AEI/10.13039/501100011033 and by ERDF A way of making Europe. es_ES
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Journal of Parallel and Distributed Computing es_ES
dc.rights Attribution - NonCommercial - NoDerivatives (by-nc-nd) es_ES
dc.subject NUMA architectures es_ES
dc.subject Chiplets es_ES
dc.subject Dense linear algebra es_ES
dc.subject Shared memory programming es_ES
dc.subject Portability es_ES
dc.subject.classification COMPUTER ARCHITECTURE AND TECHNOLOGY es_ES
dc.title Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.jpdc.2023.01.004 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-107255GB-C21/ES/BSC - COMPUTACION DE ALTAS PRESTACIONES VIII/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-113656RB-C22/ES/COMPUTACION Y COMUNICACIONES DE ALTAS PRESTACIONES CONSCIENTES DEL CONSUMO ENERGETICO. APLICACIONES AL APRENDIZAJE PROFUNDO COMPUTACIONAL - UPV/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-126576NB-I00/ES/SOFTWARE DE SISTEMA PARA ARQUITECTURAS Y APLICACIONES DE NUEVA GENERACION/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/955558/EU es_ES
dc.relation.projectID info:eu-repo/grantAgreement/CAM//S2018%2FTCS-4423 / es_ES
dc.relation.projectID info:eu-repo/grantAgreement/CAM//PR65%2F19-22445/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GC//2017-SGR-1414/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.description.bibliographicCitation Catalán, S.; Igual, FD.; Herrero, JR.; Rodríguez-Sánchez, R.; Quintana-Ortí, ES. (2023). Programming parallel dense matrix factorizations and inversion for new-generation NUMA architectures. Journal of Parallel and Distributed Computing. 175:51-65. https://doi.org/10.1016/j.jpdc.2023.01.004 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1016/j.jpdc.2023.01.004 es_ES
dc.description.upvformatpinicio 51 es_ES
dc.description.upvformatpfin 65 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 175 es_ES
dc.relation.pasarela S\485750 es_ES
dc.contributor.funder Comunidad de Madrid es_ES
dc.contributor.funder European Commission es_ES
dc.contributor.funder Generalitat de Catalunya es_ES
dc.contributor.funder Agencia Estatal de Investigación es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder Universitat Politècnica de València es_ES

