
Using Ginkgo's memory accessor for improving the accuracy of memory-bound low precision BLAS

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia


dc.contributor.author Grützmacher, Thomas es_ES
dc.contributor.author Anzt, Hartwig es_ES
dc.contributor.author Quintana-Ortí, Enrique S. es_ES
dc.date.accessioned 2023-07-21T18:06:10Z
dc.date.available 2023-07-21T18:06:10Z
dc.date.issued 2023-01 es_ES
dc.identifier.issn 0038-0644 es_ES
dc.identifier.uri http://hdl.handle.net/10251/195332
dc.description.abstract [EN] The roofline model not only provides a powerful tool to relate an application's performance to the specific constraints imposed by the target hardware but also offers a graphic representation of the balance between memory access cost and compute throughput. In this work, we present a strategy to break up the tight coupling between the precision format used for arithmetic operations and the storage format employed for memory operations. (At a high level, this idea is equivalent to compressing/decompressing the data in registers before/after invoking store/load memory operations.) In practice, we demonstrate that a "memory accessor" that hides the data compression behind the memory access can virtually push the bandwidth-induced roofline, yielding higher performance for memory-bound applications that use high precision arithmetic and can handle the numerical effects associated with lossy compression. We also demonstrate that memory-bound applications operating on low precision data can increase their accuracy by relying on the memory accessor to perform all arithmetic operations in high precision. In particular, we demonstrate that memory-bound BLAS operations (including the sparse matrix-vector product) can be re-engineered with the memory accessor and that the resulting accessor-enabled BLAS routines achieve lower rounding errors while delivering the same performance as the fast low precision BLAS. es_ES
dc.description.sponsorship Helmholtz-Gemeinschaft, Grant/Award Number: VH-NG-1241; US Exascale Computing Project, Grant/Award Number: 17-SC-20-SC es_ES
dc.language English es_ES
dc.publisher John Wiley & Sons es_ES
dc.relation.ispartof Software Practice and Experience es_ES
dc.rights Attribution (by) es_ES
dc.subject Accessor es_ES
dc.subject Floating-point formats es_ES
dc.subject High performance es_ES
dc.subject Memory-bound algorithms es_ES
dc.subject Mixed precision es_ES
dc.subject Roofline model es_ES
dc.subject.classification COMPUTER ARCHITECTURE AND TECHNOLOGY es_ES
dc.title Using Ginkgo's memory accessor for improving the accuracy of memory-bound low precision BLAS es_ES
dc.type Article es_ES
dc.identifier.doi 10.1002/spe.3041 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/DOE//17-SC-20-SC//Exascale Computing Project/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/Helmholtz Association of German Research Centers//VH-NG-1241/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.description.bibliographicCitation Grützmacher, T.; Anzt, H.; Quintana-Ortí, ES. (2023). Using Ginkgo's memory accessor for improving the accuracy of memory-bound low precision BLAS. Software Practice and Experience. 53(1):81-98. https://doi.org/10.1002/spe.3041 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1002/spe.3041 es_ES
dc.description.upvformatpinicio 81 es_ES
dc.description.upvformatpfin 98 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 53 es_ES
dc.description.issue 1 es_ES
dc.relation.pasarela S\479300 es_ES
dc.contributor.funder U.S. Department of Energy es_ES
dc.contributor.funder Helmholtz Association of German Research Centers es_ES
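
Illustrative note on the abstract: the idea of decoupling the storage format from the arithmetic format can be sketched in a few lines. This is not Ginkgo's API and not the authors' implementation. The class `Accessor` and the functions `to_f32`, `dot_low_precision`, `dot_accessor`, and `compare` are hypothetical names chosen for this example, and the test vectors are assumptions. The sketch keeps vectors in 32-bit storage and compares a dot product whose every operation is rounded to 32 bits against one that widens each loaded value and accumulates in 64 bits.

```python
import struct
from array import array
from fractions import Fraction


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


class Accessor:
    """Hypothetical accessor: memory holds float32, arithmetic sees float64."""

    def __init__(self, values):
        # array("f") stores C floats, so memory traffic is that of float32.
        self._storage = array("f", values)

    def __len__(self):
        return len(self._storage)

    def __getitem__(self, i):
        # "Decompress" on load: widen the stored float32 to a Python float.
        return float(self._storage[i])

    def __setitem__(self, i, value):
        # "Compress" on store: the array rounds the value to float32.
        self._storage[i] = value


def dot_low_precision(x, y):
    """Dot product with every product and sum rounded to float32."""
    acc = 0.0
    for i in range(len(x)):
        acc = to_f32(acc + to_f32(x[i] * y[i]))
    return acc


def dot_accessor(x, y):
    """Dot product on the same float32 storage, accumulated in float64."""
    acc = 0.0
    for i in range(len(x)):
        acc += x[i] * y[i]
    return acc


def compare(n=10000):
    """Return (low precision error, accessor error) for assumed test vectors."""
    x = Accessor([0.1] * n)
    y = Accessor([1.0] * n)
    # Exact rational reference computed from the values actually stored.
    exact = sum(Fraction(x[i]) * Fraction(y[i]) for i in range(n))
    err_low = abs(Fraction(dot_low_precision(x, y)) - exact)
    err_acc = abs(Fraction(dot_accessor(x, y)) - exact)
    return float(err_low), float(err_acc)


if __name__ == "__main__":
    err_low, err_acc = compare()
    print(f"float32 arithmetic error: {err_low:.3e}")
    print(f"accessor (float64) error: {err_acc:.3e}")
```

Both variants read the same 4-byte entries, so a memory-bound kernel would move the same amount of data in either case; only the rounding applied to the arithmetic differs.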

