Multi-threaded dense linear algebra libraries for low-power asymmetric multicore processors
Fecha
Autores
Catalán, Sandra
Herrero, José R.
Igual Peña, Francisco Daniel
Rodríguez-Sánchez, Rafael
Adeniyi-Jones, Chris
Directores
Handle
https://riunet.upv.es/handle/10251/147617
Cita bibliográfica
Catalán, S.; Herrero, JR.; Igual Peña, FD.; Rodríguez-Sánchez, R.; Quintana Ortí, ES.; Adeniyi-Jones, C. (2018). Multi-threaded dense linear algebra libraries for low-power asymmetric multicore processors. Journal of Computational Science. 25:140-151. https://doi.org/10.1016/j.jocs.2016.10.020
Titulación
Resumen
[EN] Dense linear algebra libraries, such as BLAS and LAPACK, provide a relevant collection of numerical tools for many scientific and engineering applications. While there exist high performance implementations of the BLAS (and LAPACK) functionality for many current multi-threaded architectures, the adaption of these libraries for asymmetric multicore processors (AMPs) is still pending. In this paper we address this challenge by developing an asymmetry-aware implementation of the BLAS, based on the BLIS framework, and tailored for AMPs equipped with two types of cores: fast/power-hungry versus slow/energy-efficient. For this purpose, we integrate coarse-grain and fine-grain parallelization strategies into the library routines which, respectively, dynamically distribute the workload between the two core types and statically repartition this work among the cores of the same type.
Our results on an ARM (R) big.LITTLE (TM) processor embedded in the Exynos 5422 SoC, using the asymmetry-aware version of the BLAS and a plain migration of the legacy version of LAPACK, experimentally assess the benefits, limitations, and potential of this approach from the perspectives of both throughput and energy efficiency. (C) 2016 Elsevier B.V. All rights reserved.
Palabras clave
Dense linear algebra, BLAS, LAPACK, Asymmetric multicore processors, Multi-threading, High performance computing
ISSN
1877-7503
ISBN
Fuente
Journal of Computational Science
DOI
10.1016/j.jocs.2016.10.020
Enlaces relacionados
Código de Proyecto
info:eu-repo/grantAgreement/MINECO//TIN2014-53495-R/ES/COMPUTACION HETEROGENEA DE BAJO CONSUMO/
info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/
info:eu-repo/grantAgreement/Generalitat de Catalunya//2014 SGR 1051/
info:eu-repo/grantAgreement/MICINN//TIN2011-23283/ES/POWER-AWARE HIGH PERFORMANCE COMPUTING/
info:eu-repo/grantAgreement/MINECO//TIN2015-65277-R/ES/COMPPUTACION HETEROGENEA EFICIENTE: DEL PROCESADOR AL DATACENTER/
info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/
info:eu-repo/grantAgreement/Generalitat de Catalunya//2014 SGR 1051/
info:eu-repo/grantAgreement/MICINN//TIN2011-23283/ES/POWER-AWARE HIGH PERFORMANCE COMPUTING/
info:eu-repo/grantAgreement/MINECO//TIN2015-65277-R/ES/COMPPUTACION HETEROGENEA EFICIENTE: DEL PROCESADOR AL DATACENTER/
Agradecimientos
The researchers from Universidad Jaume I were supported by projects CICYT TIN2011-23283 and TIN2014-53495-R of MINECO and FEDER, and the FPU program of MECD. The researcher from Universidad Complutense de Madrid was supported by project CICYT TIN2015-65277-R. The researcher from Universitat Politecnica de Catalunya was supported by projects TIN2015-65316-P from the Spanish Ministry of Education and 2014 SGR 1051 from the Generalitat de Catalunya, Dep. dinnovacio, Universitats i Empresa.