
Performance modeling of the sparse matrix-vector product via convolutional neural networks

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia


dc.contributor.author Barreda, María es_ES
dc.contributor.author Dolz, Manuel F. es_ES
dc.contributor.author Castaño Álvarez, María Asunción es_ES
dc.contributor.author Alonso-Jordá, Pedro es_ES
dc.contributor.author Quintana-Orti, Enrique S. es_ES
dc.date.accessioned 2021-11-05T14:07:09Z
dc.date.available 2021-11-05T14:07:09Z
dc.date.issued 2020-11 es_ES
dc.identifier.uri http://hdl.handle.net/10251/176273
dc.description.abstract [EN] Modeling the execution time of the sparse matrix-vector multiplication (SpMV) on a current CPU architecture is especially complex due to (i) irregular memory accesses; (ii) indirect memory referencing; and (iii) low arithmetic intensity. While analytical models may yield accurate estimates for the total number of cache hits/misses, they often fail to accurately predict the total execution time. In this paper, we depart from the analytic approach to instead leverage convolutional neural networks (CNNs) in order to provide an effective estimation of the performance of the SpMV operation. For this purpose, we present a high-level abstraction of the sparsity pattern of the problem matrix and propose a blockwise strategy to feed the CNN models by blocks of nonzero elements. The experimental evaluation on a representative subset of the matrices from the SuiteSparse Matrix collection demonstrates the robustness of the CNN models for predicting the SpMV performance on an Intel Haswell core. Furthermore, we show how to generalize the network models to other target architectures to estimate the performance of SpMV on an ARM A57 core. es_ES
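The abstract summarizes two ingredients: the SpMV kernel whose runtime is being modeled, and a blockwise density abstraction of the sparsity pattern used to feed the CNNs. The sketch below illustrates both in plain Python; the function names, CSR layout, and fixed block size are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np


def spmv_csr(vals, col_idx, row_ptr, x):
    """SpMV y = A @ x with A stored in CSR format.

    The indirect access x[col_idx[k]] is the irregular memory pattern
    that makes analytical performance modeling of SpMV difficult.
    """
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows)
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y


def density_blocks(row_idx, col_idx, n, block=4):
    """Hypothetical blockwise abstraction of an n x n sparsity pattern:
    a grid of per-block nonzero densities, i.e. an image-like input
    that a CNN could consume. Block size is an illustrative choice.
    """
    g = n // block
    img = np.zeros((g, g))
    for i, j in zip(row_idx, col_idx):
        if i // block < g and j // block < g:
            img[i // block, j // block] += 1.0
    return img / (block * block)  # fraction of nonzeros per block
```

A model would then be trained on such density grids (plus, e.g., matrix-size features) against measured SpMV execution times; the paper's exact input encoding and network topology are described in the full text.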
dc.description.sponsorship This work was supported by project TIN2017-82972-R from the MINECO, Spain. Manuel F. Dolz was also supported by the Plan GenT project CDEIGENT/2018/014 from the Generalitat Valenciana, Spain. Maria Barreda was also supported by the POSDOC-A/2017/11 project from the Universitat Jaume I es_ES
dc.language English es_ES
dc.publisher Springer-Verlag es_ES
dc.relation.ispartof The Journal of Supercomputing (Online) es_ES
dc.rights All rights reserved es_ES
dc.subject Sparse matrix-vector multiplication (SpMV) es_ES
dc.subject Performance modeling es_ES
dc.subject Supervised learning es_ES
dc.subject Convolutional neural networks (CNNs) es_ES
dc.subject.classification CIENCIAS DE LA COMPUTACION E INTELIGENCIA ARTIFICIAL es_ES
dc.subject.classification ARQUITECTURA Y TECNOLOGIA DE COMPUTADORES es_ES
dc.title Performance modeling of the sparse matrix-vector product via convolutional neural networks es_ES
dc.type Article es_ES
dc.identifier.doi 10.1007/s11227-020-03186-1 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-82972-R/ES/TECNICAS ALGORITMICAS PARA COMPUTACION DE ALTO RENDIMIENTO CONSCIENTE DEL CONSUMO ENERGETICO Y RESISTENTE A ERRORES/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UJI//POSDOC-A%2F2017%2F11/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GVA//CDEIGENT%2F2018%2F014/ es_ES
dc.rights.accessRights Open access es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Barreda, M.; Dolz, MF.; Castaño Alvarez, MA.; Alonso-Jordá, P.; Quintana-Orti, ES. (2020). Performance modeling of the sparse matrix-vector product via convolutional neural networks. The Journal of Supercomputing (Online). 76(11):8883-8900. https://doi.org/10.1007/s11227-020-03186-1 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1007/s11227-020-03186-1 es_ES
dc.description.upvformatpinicio 8883 es_ES
dc.description.upvformatpfin 8900 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 76 es_ES
dc.description.issue 11 es_ES
dc.identifier.eissn 1573-0484 es_ES
dc.relation.pasarela S\417899 es_ES
dc.contributor.funder Universitat Jaume I es_ES
dc.contributor.funder Generalitat Valenciana es_ES
dc.contributor.funder Agencia Estatal de Investigación es_ES
dc.description.references Abdelfattah A, Ltaief H, Keyes D (2015) High performance multi-GPU SpMV for multi-component PDE-based applications. In: Träff JL, Hunold S, Versaci F (eds) Euro-Par 2015: parallel processing. Springer, Berlin, pp 601–612 es_ES
dc.description.references Schiesser WE (2014) Computational mathematics in engineering and applied science: ODEs, DAEs, and PDEs. CRC Press, Boca Raton es_ES
dc.description.references Vuduc R, Demmel JW, Yelick KA (2005) OSKI: a library of automatically tuned sparse matrix kernels. J Phys Conf Ser 16:521–530 es_ES
dc.description.references Williams S, Oliker L, Vuduc R, Shalf J, Yelick K, Demmel J (2007) Optimization of sparse matrix–vector multiplication on emerging multicore platforms. In: SC ’07: Proceedings of the 2007 ACM/IEEE Conference on Supercomputing, pp 1–12 es_ES
dc.description.references Elafrou A, Goumas G, Koziris N (2017) Performance analysis and optimization of sparse matrix–vector multiplication on modern multi- and many-core processors. In: 2017 46th International Conference on Parallel Processing (ICPP), pp 292–301 es_ES
dc.description.references Li S, Chang H, Zhang J, Zhang Y (2015) Automatic tuning of sparse matrix–vector multiplication on multicore clusters. Sci China Inf Sci 58(9):1–14 es_ES
dc.description.references Guo P, Wang L (2015) Accurate cross-architecture performance modeling for sparse matrix–vector multiplication (SpMV) on GPUs. Concurr Comput Pract Exp 27(13):3281–3294 es_ES
dc.description.references Li K, Yang W, Li K (2015) Performance analysis and optimization for SpMV on GPU using probabilistic modeling. IEEE Trans Parallel Distrib Syst 26(1):196–205 es_ES
dc.description.references Eijkhout V, Pozo R (1994) Data structures and algorithms for distributed sparse matrix operations. Technical report es_ES
dc.description.references Gu J, Wang Z, Kuen J, Ma L, Shahroudy A, Shuai B, Liu T, Wang X, Wang G, Cai J, Chen T (2018) Recent advances in convolutional neural networks. Pattern Recognit 77(C):354–377 es_ES
dc.description.references Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier neural networks. In: Gordon G, Dunson D, Dudík M (eds) Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, volume 15 of Proceedings of Machine Learning Research. PMLR, Fort Lauderdale, FL, USA, pp 315–323 es_ES
dc.description.references Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. In: Proceedings of the 32nd International Conference on International Conference on Machine Learning, Volume 37 (ICML’15). JMLR org, pp 448–456 es_ES
dc.description.references Keras: The Python Deep Learning library. https://keras.io/. Accessed Dec 2019 es_ES
dc.description.references TensorFlow, an open source machine learning library for research and production. https://www.tensorflow.org/. Accessed Dec 2019 es_ES
dc.description.references Keras + Hyperopt: a very simple wrapper for convenient hyperparameter optimization. http://maxpumperla.com/hyperas/. Accessed Dec 2019 es_ES
dc.description.references Bergstra J, Komer B, Eliasmith C, Yamins D, Cox D (2015) Hyperopt: a python library for model selection and hyperparameter optimization. Comput Sci Discov. https://doi.org/10.1088/1749-4699/8/1/014008 es_ES
dc.description.references Bergstra J, Yamins D, Cox DD (2013) Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures. In: Proceedings of the 30th International Conference on International Conference on Machine Learning—Volume 28, ICML’13. JMLR.org, pp I–115–I–123 es_ES
dc.description.references SuiteSparse Matrix Collection. https://sparse.tamu.edu/. Accessed Dec 2019 es_ES
dc.description.references Bishop CM (2006) Pattern recognition and machine learning (information science and statistics). Springer, Berlin es_ES
dc.description.references Pan SJ, Yang Qiang (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359 es_ES
dc.description.references Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117 es_ES
dc.description.references LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444 es_ES
dc.description.references Götz M, Anzt H (2018) Machine learning-aided numerical linear algebra: convolutional neural networks for the efficient preconditioner generation. In: Proceedings of ScalA'18: 9th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, Workshop at Supercomputing 2018 es_ES
dc.description.references Zhao Y, Li J, Liao C, Shen X (2018) Bridging the gap between deep learning and sparse matrix format selection. SIGPLAN Not 53(1):94–108 es_ES
dc.description.references Cui H, Hirasawa S, Kobayashi H, Takizawa H (2018) A machine learning-based approach for selecting SpMV kernels and matrix storage formats. IEICE Trans Inf Syst E101.D(9):2307–2314 es_ES
dc.description.references Nisa I, Siegel C, Rajam AS, Vishnu A, Sadayappan P (2018) Effective machine learning based format selection and performance modeling for SpMV on GPUs. EasyChair Preprint no. 388, EasyChair es_ES
dc.description.references Tiwari A, Laurenzano MA, Carrington L, Snavely A (2012) Modeling power and energy usage of HPC kernels. In: 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops PhD Forum, pp 990–998 es_ES
dc.description.references Benatia A, Ji W, Wang Y, Shi F (2016) Machine learning approach for the predicting performance of SpMV on GPU. In: 2016 IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS), pp 894–901 es_ES

