Enhancing performance and energy consumption of runtime schedulers for dense linear algebra

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

dc.contributor.author Alonso-Jordá, Pedro es_ES
dc.contributor.author Dolz Zaragozá, Manuel Francisco es_ES
dc.contributor.author Igual, Francisco D. es_ES
dc.contributor.author Mayo, Rafael es_ES
dc.contributor.author Quintana Ortí, Enrique Salvador es_ES
dc.date.accessioned 2016-04-14T08:14:43Z
dc.date.available 2016-04-14T08:14:43Z
dc.date.issued 2014-10
dc.identifier.issn 1532-0626
dc.identifier.uri http://hdl.handle.net/10251/62541
dc.description.abstract The road towards Exascale Computing requires a holistic effort to address three different challenges simultaneously: high performance, energy efficiency, and programmability. The use of runtime task schedulers to orchestrate parallel executions with minimal developer intervention has been introduced in recent years to tackle the programmability issue while maintaining, or even improving, performance. In this paper, we enhance the SuperMatrix runtime task scheduler integrated in the libflame library in two different directions that address high performance and energy efficiency. First, we extend the runtime by accommodating hybrid parallel executions and managing task priorities for dense linear algebra operations, with remarkable performance improvements. Second, we introduce techniques to reduce energy consumption during idle times inherent to parallel executions, attaining important energy savings. In addition, we propose a power consumption model that can be leveraged by runtime task schedulers to make decisions based not only on performance but also on energy considerations. es_ES
dc.description.sponsorship This research was supported by project CICYT TIN2011-23283 and FEDER, and by the EU-FET grant 'EXA2GREEN' 318793. Francisco D. Igual was supported by project TIN2012-32180. en_EN
dc.language English es_ES
dc.publisher Wiley es_ES
dc.relation.ispartof Concurrency and Computation: Practice and Experience es_ES
dc.rights All rights reserved es_ES
dc.subject Runtime schedulers es_ES
dc.subject Energy-aware computing es_ES
dc.subject Hybrid architectures es_ES
dc.subject Dense linear algebra es_ES
dc.subject.classification COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Enhancing performance and energy consumption of runtime schedulers for dense linear algebra es_ES
dc.type Article es_ES
dc.identifier.doi 10.1002/cpe.3317
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2011-23283/ES/POWER-AWARE HIGH PERFORMANCE COMPUTING/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/318793/EU/Energy-Aware Sustainable Computing on Future Technology – Paving the Road to Exascale Computing/
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//TIN2012-32180/ES/ARQUITECTURAS Y TECNOLOGIAS EMERGENTES. EFICIENCIA ENERGETICA MEDIANTE HETEROGENEIDAD/ es_ES
dc.rights.accessRights Closed es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Alonso-Jordá, P.; Dolz Zaragozá, MF.; Igual, FD.; Mayo, R.; Quintana Ortí, ES. (2014). Enhancing performance and energy consumption of runtime schedulers for dense linear algebra. Concurrency and Computation: Practice and Experience. 26(15):2591-2611. https://doi.org/10.1002/cpe.3317 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion http://dx.doi.org/10.1002/cpe.3317 es_ES
dc.description.upvformatpinicio 2591 es_ES
dc.description.upvformatpfin 2611 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 26 es_ES
dc.description.issue 15 es_ES
dc.relation.senia 267875 es_ES
dc.contributor.funder Ministerio de Economía y Competitividad es_ES
dc.contributor.funder Ministerio de Ciencia e Innovación es_ES
dc.contributor.funder European Regional Development Fund es_ES