dc.contributor.author | Aliaga, José I. | es_ES |
dc.contributor.author | Anzt, Hartwig | es_ES |
dc.contributor.author | Grützmacher, Thomas | es_ES |
dc.contributor.author | Quintana-Ortí, Enrique S. | es_ES |
dc.contributor.author | Tomás Domínguez, Andrés Enrique | es_ES |
dc.date.accessioned | 2023-09-12T18:04:41Z | |
dc.date.available | 2023-09-12T18:04:41Z | |
dc.date.issued | 2022-06-25 | es_ES |
dc.identifier.issn | 1532-0626 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/196280 | |
dc.description.abstract | [EN] We contribute to the optimization of the sparse matrix-vector product by introducing a variant of the coordinate sparse matrix format that balances the workload distribution and compresses both the indexing arrays and the numerical information. Our approach is multi-platform, in the sense that the realizations for (general-purpose) multicore processors as well as graphics accelerators (GPUs) are built upon common principles but differ in the implementation details, which are adapted to avoid thread divergence in the GPU case or to maximize compression element-wise (i.e., for each matrix entry) for multicore architectures. Our evaluation on the last two generations of NVIDIA GPUs as well as Intel and AMD processors demonstrates the benefits of the new kernels when compared with the optimized implementations of the sparse matrix-vector product in NVIDIA's cuSPARSE and Intel's MKL, respectively. | es_ES |
dc.description.sponsorship | J. I. Aliaga, E. S. Quintana-Ortí, and A. E. Tomás were supported by TIN2017-82972-R of the Spanish MINECO. H. Anzt and T. Grützmacher were supported by the Impuls und Vernetzungsfond of the Helmholtz Association under grant VH-NG-1241 and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. The authors would like to thank the Steinbuch Centre for Computing (SCC) of the Karlsruhe Institute of Technology for providing access to an NVIDIA A100 GPU. | es_ES |
dc.language | English | es_ES |
dc.publisher | John Wiley & Sons | es_ES |
dc.relation.ispartof | Concurrency and Computation: Practice and Experience | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject | Compression | es_ES |
dc.subject | Coordinate sparse matrix format | es_ES |
dc.subject | Graphics processing units (GPUs) | es_ES |
dc.subject | Multicore processors (CPUs) | es_ES |
dc.subject | Sparse matrix-vector product | es_ES |
dc.subject | Workload balancing | es_ES |
dc.subject.classification | COMPUTER ARCHITECTURE AND TECHNOLOGY | es_ES |
dc.title | Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1002/cpe.6515 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-82972-R/ES/TECNICAS ALGORITMICAS PARA COMPUTACION DE ALTO RENDIMIENTO CONSCIENTE DEL CONSUMO ENERGETICO Y RESISTENTE A ERRORES/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/DOE//17-SC-20-SC//Exascale Computing Project/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/Helmholtz Association of German Research Centers//VH-NG-1241/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica | es_ES |
dc.description.bibliographicCitation | Aliaga, JI.; Anzt, H.; Grützmacher, T.; Quintana-Ortí, ES.; Tomás Domínguez, AE. (2022). Compression and load balancing for efficient sparse matrix-vector product on multicore processors and graphics processing units. Concurrency and Computation: Practice and Experience. 34(14):1-13. https://doi.org/10.1002/cpe.6515 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1002/cpe.6515 | es_ES |
dc.description.upvformatpinicio | 1 | es_ES |
dc.description.upvformatpfin | 13 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 34 | es_ES |
dc.description.issue | 14 | es_ES |
dc.relation.pasarela | S\465928 | es_ES |
dc.contributor.funder | Nvidia | es_ES |
dc.contributor.funder | U.S. Department of Energy | es_ES |
dc.contributor.funder | Agencia Estatal de Investigación | es_ES |
dc.contributor.funder | Helmholtz Association of German Research Centers | es_ES |