dc.contributor.author | Boratto, Murilo | es_ES |
dc.contributor.author | Alonso-Jordá, Pedro | es_ES |
dc.contributor.author | Gimenez, Domingo | es_ES |
dc.contributor.author | Lastovetsky, Alexey | es_ES |
dc.date.accessioned | 2020-10-17T03:32:33Z | |
dc.date.available | 2020-10-17T03:32:33Z | |
dc.date.issued | 2017-01 | es_ES |
dc.identifier.issn | 0920-8542 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/152271 | |
dc.description.abstract | [EN] Automatic tuning methodologies have been used in the design of routines in recent years. The goal of these methodologies is to develop routines which automatically adapt to the conditions of the underlying computational system so that efficient executions are obtained independently of the end-user experience. This paper aims to explore programming routines that can automatically be adapted to the computational system conditions thanks to these automatic tuning methodologies. In particular, we have worked on the evaluation of matrix polynomials on multicore and multi-GPU systems as a target application. This application is very useful for the computation of matrix functions like the sine or cosine but, at the same time, the application is very time consuming since the basic computational kernel, which is the matrix multiplication, is carried out many times. The use of all available resources within a node in an easy and efficient way is crucial for the end user. | es_ES |
dc.description.sponsorship | This work has been partially supported by Generalitat Valenciana under Grant PROMETEOII/2014/003, and by the Spanish MINECO, as well as European Commission FEDER funds, under Grant TEC2015-67387-C4-1-R and TIN2015-66972-C5-3-R, and network CAPAP-H. Also, we have worked in cooperation with the EU-COST Programme Action IC1305, "Network for Sustainable Ultrascale Computing (NESUS)". | es_ES |
dc.language | English | es_ES |
dc.publisher | Springer-Verlag | es_ES |
dc.relation.ispartof | The Journal of Supercomputing | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject | Automatic Tuning | es_ES |
dc.subject | Matrix Polynomials | es_ES |
dc.subject | Performance | es_ES |
dc.subject | Multicore | es_ES |
dc.subject | Multi-GPU | es_ES |
dc.subject.classification | COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE | es_ES |
dc.title | Automatic Tuning to Performance Modelling of Matrix Polynomials on Multicore and Multi-GPU Systems | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1007/s11227-016-1694-y | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/COST//IC1305/EU/Network for Sustainable Ultrascale Computing (NESUS)/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//TIN2015-66972-C5-3-R/ES/TECNICAS PARA LA MEJORA DE LAS PRESTACIONES, FIABILIDAD Y CONSUMO DE ENERGIA DE LOS SERVIDORES. OPTIMIZACION DE APLICACIONES CIENTIFICAS, MEDICAS Y DE VISION ARTIFICIAL/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//TEC2015-67387-C4-1-R/ES/SMART SOUND PROCESSING FOR THE DIGITAL LIVING/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GVA//PROMETEOII%2F2014%2F003/ES/Computación y comunicaciones de altas prestaciones y aplicaciones en ingeniería/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació | es_ES |
dc.description.bibliographicCitation | Boratto, M.; Alonso-Jordá, P.; Gimenez, D.; Lastovetsky, A. (2017). Automatic Tuning to Performance Modelling of Matrix Polynomials on Multicore and Multi-GPU Systems. The Journal of Supercomputing. 73(1):227-239. https://doi.org/10.1007/s11227-016-1694-y | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1007/s11227-016-1694-y | es_ES |
dc.description.upvformatpinicio | 227 | es_ES |
dc.description.upvformatpfin | 239 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 73 | es_ES |
dc.description.issue | 1 | es_ES |
dc.relation.pasarela | S\302771 | es_ES |
dc.contributor.funder | Generalitat Valenciana | es_ES |
dc.contributor.funder | European Regional Development Fund | es_ES |
dc.contributor.funder | Ministerio de Economía y Competitividad | es_ES |
dc.contributor.funder | European Cooperation in Science and Technology | es_ES |
dc.description.references | Alberti PV, Alonso P, Vidal AM, Cuenca J, Giménez D (2004) Designing polylibraries to speed up linear algebra computations. IJHPCN 1(1/2/3):75–84 | es_ES |
dc.description.references | Alonso P, Boratto M, Pinilla J, Ibañez J, Martinez J (2014) On the evaluation of matrix polynomials using several GPGPUs. Tech Rep Riunet/E10251/39615 | es_ES |
dc.description.references | Anderson E, Bai Z, Bischof C, Demmel J, Dongarra J, Croz JD, Greenbaum A, Hammarling S, McKenney A, Ostrouchov S, Sorensen D (2013) LAPACK users guide, 2nd edn. SIAM, Philadelphia | es_ES |
dc.description.references | Blackford LS, Demmel J, Dongarra J, Duff I, Hammarling S, Henry G, Heroux M, Kaufman L, Lumsdaine A, Petitet A, Pozo R, Remington K, Whaley RC (2001) An updated set of basic linear algebra subprograms (BLAS). ACM Trans Math Softw 28:135–151 | es_ES |
dc.description.references | Caron E, Uter F (2002) Parallel extension of a dynamic performance forecasting tool. Sci Ann Cuza Univ 11:80–93 | es_ES |
dc.description.references | Chandra R (2001) Parallel programming in OpenMP. Morgan Kaufmann, Burlington | es_ES |
dc.description.references | Demmel J, Marques O, Parlett BN, Vömel C (2008) Performance and accuracy of LAPACK’s symmetric tridiagonal eigensolvers. SIAM J Sci Comput 30(3):1508–1526 | es_ES |
dc.description.references | Frigo M, Johnson S (1998) FFTW: an adaptive software architecture for the FFT. In: Proceedings of IEEE International Conference on Acoustics Speech and Signal Processing vol. 3, pp 1381–1384 | es_ES |
dc.description.references | García L, Cuenca J, Giménez D (2007) Including improvement of the execution time in a software architecture of libraries with self-optimisation. In: ICSOFT 2007, Proceedings of the Second International Conference on Software and Data Technologies, Volume SE, Barcelona, Spain, pp 156–161, 22–25 July | es_ES |
dc.description.references | García LP, Cuenca J, Giménez D (2014) On optimization techniques for the matrix multiplication on hybrid CPU+GPU platforms. Ann Multicore GPU Program 1(1):10–18 | es_ES |
dc.description.references | Hasanov K, Quintin JN, Lastovetsky A (2014) Hierarchical approach to optimization of parallel matrix multiplication on large-scale platforms. J Supercomput 71(11):24–34 | es_ES |
dc.description.references | Katagiri T, Kise K, Honda H (2005) RAO-SS: a prototype of run-time auto-tuning facility for sparse direct solvers. Tech Rep 22(1):1–10 | es_ES |
dc.description.references | Katagiri T, Kise K, Honda H, Yuba T (2004) Effect of auto-tuning with user’s knowledge for numerical software. Proceedings of the 1st conference on computing frontiers, Ischia, Italy. ACM, New York, NY, USA, pp 12–25 | es_ES |
dc.description.references | Nath R, Tomov S, Dongarra J (2010) An improved MAGMA GEMM for Fermi graphics processing units. Int J High Perform Comput Appl 24(4):511–515 | es_ES |
dc.description.references | Paterson MS, Stockmeyer LJ (1973) On the number of nonscalar multiplications necessary to evaluate polynomials. SIAM J Comput 2(1):60–66 | es_ES |
dc.description.references | PLASMA (2015) Parallel linear algebra software for multicore architectures. Available in: http://www.netlib.org/plasma/ . Accessed 1 June 2015 | es_ES |
dc.description.references | Tanaka T, Katagiri T, Yuba T (2007) D-spline based incremental parameter estimation in automatic performance tuning. In: International Conference on Applied Parallel Computing: State of the Art in Scientific Computing, PARA’06. Springer-Verlag, Berlin, Heidelberg, pp 986–995 | es_ES |
dc.description.references | Vuduc R, Demmel J, Bilmes J (2004) Statistical models for empirical search-based performance tuning. Int J High Perform Comput Appl 18:65–94 | es_ES |
dc.description.references | Whaley RC, Petitet A, Dongarra JJ (2001) Automated empirical optimizations of software and the ATLAS project. Parallel Comput 27:21–37 | es_ES |