A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor

RiuNet: Institutional Repository of the Universitat Politècnica de València


dc.contributor.author Ramírez-Betancourth, Cristian es_ES
dc.contributor.author Castelló, Adrián es_ES
dc.contributor.author Quintana-Ortí, Enrique S. es_ES
dc.date.accessioned 2023-07-21T18:04:35Z
dc.date.available 2023-07-21T18:04:35Z
dc.date.issued 2022-11 es_ES
dc.identifier.issn 0920-8542 es_ES
dc.identifier.uri http://hdl.handle.net/10251/195322
dc.description.abstract [EN] We address the efficient realization of matrix multiplication (gemm), with application in the convolution operator for machine learning, for the RISC-V core present in the GreenWaves GAP8 processor. Our approach leverages BLIS (Basic Linear Algebra Instantiation Software) to develop an implementation that (1) re-organizes the gemm algorithm adapting its micro-kernel to exploit the hardware-supported dot product kernel in the GAP8; (2) explicitly orchestrates the data transfers across the hierarchy of scratchpad memories via DMA (direct memory access); and (3) operates with integer arithmetic. es_ES
dc.description.sponsorship This work was supported by the research project PID2020-113656RB-C22 of MCIN/AEI/10.13039/501100011033. C. Ramirez is a "Santiago Grisolia" fellow supported by Generalitat Valenciana. Adrian Castello is a FJC2019-039222-I fellow supported by MCIN/AEI/10.13039/501100011033. This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under Grant Agreement No. 955558. The JU receives support from the European Union's Horizon 2020 research and innovation program, and Spain, Germany, France, Italy, Poland, Switzerland, Norway. es_ES
dc.language English es_ES
dc.publisher Springer-Verlag es_ES
dc.relation.ispartof The Journal of Supercomputing es_ES
dc.rights Attribution (by) es_ES
dc.subject Matrix multiplication es_ES
dc.subject High performance es_ES
dc.subject RISC-V GAP8 es_ES
dc.subject.classification COMPUTER ARCHITECTURE AND TECHNOLOGY es_ES
dc.title A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor es_ES
dc.type Article es_ES
dc.identifier.doi 10.1007/s11227-022-04581-6 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-113656RB-C22/ES/COMPUTACION Y COMUNICACIONES DE ALTAS PRESTACIONES CONSCIENTES DEL CONSUMO ENERGETICO. APLICACIONES AL APRENDIZAJE PROFUNDO COMPUTACIONAL - UPV/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AGENCIA ESTATAL DE INVESTIGACION//FJC2019-039222-I//AYUDA JUAN DE LA CIERVA FORMACION-CASTELLO GIMENO, ADRIAN/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/955558/EU es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//GRISOLIAP%2F2020%2F086//AYUDA SANTIAGO GRISOLIA: COMPUTACION DE ALTAS PRESTACIONES CONSCIENTE DEL CONSUMO PARA REDES NEURONALES PROFUNDAS./ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors es_ES
dc.description.bibliographicCitation Ramírez-Betancourth, C.; Castelló, A.; Quintana-Ortí, ES. (2022). A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor. The Journal of Supercomputing. 78(16):18051-18060. https://doi.org/10.1007/s11227-022-04581-6 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1007/s11227-022-04581-6 es_ES
dc.description.upvformatpinicio 18051 es_ES
dc.description.upvformatpfin 18060 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 78 es_ES
dc.description.issue 16 es_ES
dc.relation.pasarela S\468501 es_ES
dc.contributor.funder European Commission es_ES
dc.contributor.funder GENERALITAT VALENCIANA es_ES
dc.contributor.funder AGENCIA ESTATAL DE INVESTIGACION es_ES
dc.description.references Hazelwood K et al (2018) Applied machine learning at Facebook: a datacenter infrastructure perspective. In: IEEE International Symposium on High Performance Computer Architecture, pp 620–629 es_ES
dc.description.references Park J et al (2018) Deep learning inference in Facebook data centers: characterization, performance optimizations and hardware implications. arXiv:1811.09886 es_ES
dc.description.references Wu C et al (2019) Machine learning at Facebook: understanding inference at the edge. In: International Symposium on High Performance Computer Architecture, pp 331–344 es_ES
dc.description.references Yi S, Li C, Li Q (2015) A survey of fog computing: concepts, applications and issues. In: Proceedings of the 2015 Workshop on Mobile Big Data, ser. Mobidata’15, pp 37–42 es_ES
dc.description.references Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS), vol 1, pp 1097–1105 es_ES
dc.description.references Pouyanfar S et al (2018) A survey on deep learning: algorithms, techniques, and applications. ACM Comput Surv 51(5):92:1-92:36 es_ES
dc.description.references Sze V et al (2017) Efficient processing of deep neural networks: a tutorial and survey. Proc IEEE 105(12):2295–2329 es_ES
dc.description.references Chellapilla K, Puri S, Simard P (2006) High performance convolutional neural networks for document processing. In: Tenth International Workshop on Frontiers in Handwriting Recognition es_ES
dc.description.references Georganas E et al (2018) Anatomy of high-performance deep learning convolutions on SIMD architectures. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, ser. SC ’18. IEEE Press es_ES
dc.description.references San Juan P et al (2020) High performance and portable convolution operators for multicore processors. In: Proceedings of the 32nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pp 91–98 es_ES
dc.description.references Van Zee FG, van de Geijn RA (2015) BLIS: a framework for rapidly instantiating BLAS functionality. ACM Trans Math Softw 41(3):14:1-14:33 es_ES
dc.description.references Flamand E et al (2018) GAP-8: A RISC-V SoC for AI at the edge of the IoT. In: IEEE 29th International Conference on Application-Specific Systems, Architectures and Processors, pp 1–4 es_ES
dc.description.references Ali M et al (2012) Level-3 BLAS on the TI C6678 multi-core DSP. In: 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing es_ES
dc.description.references Lavin A, Gray S (2016) Fast algorithms for convolutional neural networks. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 4013–4021 es_ES
dc.description.references Zlateski A, Jia Z, Li K, Durand F (2019) The anatomy of efficient FFT and Winograd convolutions on modern CPUs. In: Proceedings of the ACM International Conference on Supercomputing, ser. ICS ’19, pp 414–424 es_ES
dc.description.references Low TM et al (2016) Analytical modeling is enough for high-performance BLIS. ACM Trans Math Softw 43(2) es_ES
dc.description.references Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861 es_ES
dc.description.references Gunnels JA, Gustavson FG, Henry GM, van de Geijn RA (2004) A family of high-performance matrix multiplication algorithms. In: Proceedings of the 7th International Conference on Applied Parallel Computing: State of the Art in Scientific Computing, ser. PARA’04. Berlin, Heidelberg, pp 256–265. https://doi.org/10.1007/11558958_30 es_ES
dc.description.references Smith TM, van de Geijn RA (2019) The MOMMS family of matrix multiplication algorithms. CoRR, vol. abs/1904.05717. arXiv:1904.05717 es_ES
dc.description.references Castelló A, Igual FD, Quintana-Ortí ES (2022) Anatomy of the BLIS family of algorithms for matrix multiplication. In: 30th Euromicro Workshop on Parallel, Distributed and Networked Processing PDP 2022, to appear es_ES
upv.costeAPC 2345 es_ES
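
The three ingredients named in the abstract above (a micro-kernel built around a dot product, explicit staging of blocks through small local buffers, and integer arithmetic) can be illustrated with a minimal C sketch. This is not the authors' implementation. The function names gemm_i8 and dot4_i8, the block sizes MC, NC and KC, and the choice to store B transposed are all assumptions made for illustration; the plain memcpy calls stand in for DMA transfers into scratchpad memory, and the portable loop in dot4_i8 stands in for a hardware dot-product instruction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical block sizes; real values would be derived from the
 * capacities of the scratchpad memories. KC must be a multiple of 4. */
enum { MC = 8, NC = 8, KC = 16 };

/* Portable stand-in for a hardware 4-way int8 dot product with accumulate. */
static int32_t dot4_i8(const int8_t *a, const int8_t *b, int32_t acc)
{
    for (int i = 0; i < 4; ++i)
        acc += (int32_t)a[i] * (int32_t)b[i];
    return acc;
}

/* C += A * B with int8 inputs and int32 accumulation.
 * A is m x k, row-major. Bt is B stored transposed (n x k, row-major),
 * an assumption that makes every dot product run over contiguous memory.
 * C is m x n, row-major. */
void gemm_i8(int m, int n, int k,
             const int8_t *A, const int8_t *Bt, int32_t *C)
{
    int8_t a_buf[MC * KC];   /* stand-in for a scratchpad-resident block of A */
    int8_t b_buf[NC * KC];   /* stand-in for a scratchpad-resident block of B */

    assert(m >= 0 && n >= 0 && k >= 0);

    for (int jc = 0; jc < n; jc += NC) {
        int nb = (n - jc < NC) ? n - jc : NC;
        for (int pc = 0; pc < k; pc += KC) {
            int kb  = (k - pc < KC) ? k - pc : KC;
            int kb4 = (kb + 3) / 4 * 4;      /* round up to a multiple of 4 */

            /* Stage a block of B; zero padding keeps partial dot4 calls exact.
             * On real hardware this copy would be a DMA transfer. */
            memset(b_buf, 0, sizeof b_buf);
            for (int j = 0; j < nb; ++j)
                memcpy(&b_buf[j * KC], &Bt[(jc + j) * k + pc], (size_t)kb);

            for (int ic = 0; ic < m; ic += MC) {
                int mb = (m - ic < MC) ? m - ic : MC;

                /* Stage a block of A in the same way. */
                memset(a_buf, 0, sizeof a_buf);
                for (int i = 0; i < mb; ++i)
                    memcpy(&a_buf[i * KC], &A[(ic + i) * k + pc], (size_t)kb);

                /* Micro-kernel: each C entry is updated by a chain of
                 * 4-element dot products over the staged blocks. */
                for (int i = 0; i < mb; ++i) {
                    for (int j = 0; j < nb; ++j) {
                        int32_t acc = C[(ic + i) * n + (jc + j)];
                        for (int p = 0; p < kb4; p += 4)
                            acc = dot4_i8(&a_buf[i * KC + p],
                                          &b_buf[j * KC + p], acc);
                        C[(ic + i) * n + (jc + j)] = acc;
                    }
                }
            }
        }
    }
}
```

The loop order and buffer layout here are only one of several possible arrangements. The sketch is meant to show how blocking, staging and a dot-product micro-kernel fit together, not to reproduce the variant evaluated in the article.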

