A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor

Ramírez-Betancourth, Cristian; Castelló, Adrián; Quintana-Ortí, Enrique S.

doi:10.1007/s11227-022-04581-6


Files in this item

Name: Ramirez-Betancour ...

Size: 857.0Kb

Format: PDF

Description: Publisher's version

dc.contributor.author	Ramírez-Betancourth, Cristian	es_ES
dc.contributor.author	Castelló, Adrián	es_ES
dc.contributor.author	Quintana-Ortí, Enrique S.	es_ES
dc.date.accessioned	2023-07-21T18:04:35Z
dc.date.available	2023-07-21T18:04:35Z
dc.date.issued	2022-11	es_ES
dc.identifier.issn	0920-8542	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/195322
dc.description.abstract	[EN] We address the efficient realization of matrix multiplication (gemm), with application in the convolution operator for machine learning, for the RISC-V core present in the GreenWaves GAP8 processor. Our approach leverages BLIS (Basic Linear Algebra Instantiation Software) to develop an implementation that (1) re-organizes the gemm algorithm adapting its micro-kernel to exploit the hardware-supported dot product kernel in the GAP8; (2) explicitly orchestrates the data transfers across the hierarchy of scratchpad memories via DMA (direct memory access); and (3) operates with integer arithmetic.	es_ES
dc.description.sponsorship	This work was supported by the research project PID2020-113656RB-C22 of MCIN/AEI/10.13039/501100011033. C. Ramirez is a "Santiago Grisolia" fellow supported by Generalitat Valenciana. Adrian Castello is a FJC2019-039222-I fellow supported by MCIN/AEI/10.13039/501100011033. This project has received funding from the European High-Performance Computing Joint Undertaking (JU) under Grant Agreement No. 955558. The JU receives support from the European Union's Horizon 2020 research and innovation program, and Spain, Germany, France, Italy, Poland, Switzerland, Norway.	es_ES
dc.language	English	es_ES
dc.publisher	Springer-Verlag	es_ES
dc.relation.ispartof	The Journal of Supercomputing	es_ES
dc.rights	Attribution (by)	es_ES
dc.subject	Matrix multiplication	es_ES
dc.subject	High performance	es_ES
dc.subject	RISC-V GAP8	es_ES
dc.subject.classification	COMPUTER ARCHITECTURE AND TECHNOLOGY	es_ES
dc.title	A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.1007/s11227-022-04581-6	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-113656RB-C22/ES/COMPUTACION Y COMUNICACIONES DE ALTAS PRESTACIONES CONSCIENTES DEL CONSUMO ENERGETICO. APLICACIONES AL APRENDIZAJE PROFUNDO COMPUTACIONAL - UPV/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AGENCIA ESTATAL DE INVESTIGACION//FJC2019-039222-I//AYUDA JUAN DE LA CIERVA FORMACION-CASTELLO GIMENO, ADRIAN/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/EC/H2020/955558/EU	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//GRISOLIAP%2F2020%2F086//AYUDA SANTIAGO GRISOLIA: COMPUTACION DE ALTAS PRESTACIONES CONSCIENTE DEL CONSUMO PARA REDES NEURONALES PROFUNDAS./	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors	es_ES
dc.description.bibliographicCitation	Ramírez-Betancourth, C.; Castelló, A.; Quintana-Ortí, ES. (2022). A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor. The Journal of Supercomputing. 78(16):18051-18060. https://doi.org/10.1007/s11227-022-04581-6	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	https://doi.org/10.1007/s11227-022-04581-6	es_ES
dc.description.upvformatpinicio	18051	es_ES
dc.description.upvformatpfin	18060	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	78	es_ES
dc.description.issue	16	es_ES
dc.relation.pasarela	S\468501	es_ES
dc.contributor.funder	European Commission	es_ES
dc.contributor.funder	GENERALITAT VALENCIANA	es_ES
dc.contributor.funder	AGENCIA ESTATAL DE INVESTIGACION	es_ES
dc.description.references	Hazelwood K et al (2018) Applied machine learning at Facebook: a datacenter infrastructure perspective. In: IEEE International Symposium on High Performance Computer Architecture, pp 620–629	es_ES
dc.description.references	Park J et al (2018) Deep learning inference in Facebook data centers: characterization, performance optimizations and hardware implications. arXiv:1811.09886	es_ES
dc.description.references	Wu C et al (2019) Machine learning at Facebook: understanding inference at the edge. In: International Symposium on High Performance Computer Architecture, pp 331–344	es_ES
dc.description.references	Yi S, Li C, Li Q (2015) A survey of fog computing: concepts, applications and issues. In: Proceedings of the 2015 Workshop on Mobile Big Data, ser. Mobidata’15, pp 37–42	es_ES
dc.description.references	Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS), vol 1, pp 1097–1105	es_ES
dc.description.references	Pouyanfar S et al (2018) A survey on deep learning: algorithms, techniques, and applications. ACM Comput Surv 51(5):92:1-92:36	es_ES
dc.description.references	Sze V et al (2017) Efficient processing of deep neural networks: a tutorial and survey. Proc IEEE 105(12):2295–2329	es_ES
dc.description.references	Chellapilla K, Puri S, Simard P (2006) High performance convolutional neural networks for document processing. In: Tenth International Workshop on Frontiers in Handwriting Recognition	es_ES
dc.description.references	Georganas E et al (2018) Anatomy of high-performance deep learning convolutions on SIMD architectures. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, ser. SC ’18. IEEE Press	es_ES
dc.description.references	San Juan P et al (2020) High performance and portable convolution operators for multicore processors. In: Proceedings of the 32nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pp 91–98	es_ES
dc.description.references	Van Zee FG, van de Geijn RA (2015) BLIS: a framework for rapidly instantiating BLAS functionality. ACM Trans Math Softw 41(3):14:1-14:33	es_ES
dc.description.references	Flamand E et al (2018) GAP-8: A RISC-V SoC for AI at the edge of the IoT. In: IEEE 29th International Conference on Application-Specific Systems, Architectures and Processors, pp 1–4	es_ES
dc.description.references	Ali M et al (2012) Level-3 BLAS on the TI C6678 multi-core DSP. In: 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing	es_ES
dc.description.references	Lavin A, Gray S (2016) Fast algorithms for convolutional neural networks. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 4013–4021	es_ES
dc.description.references	Zlateski A, Jia Z, Li K, Durand F (2019) The anatomy of efficient FFT and Winograd convolutions on modern CPUs. In: Proceedings of the ACM International Conference on Supercomputing, ser. ICS ’19, pp 414–424	es_ES
dc.description.references	Low TM et al (2016) Analytical modeling is enough for high-performance BLIS. ACM Trans Math Softw 43(2)	es_ES
dc.description.references	Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861	es_ES
dc.description.references	Gunnels JA, Gustavson FG, Henry GM, van de Geijn RA (2004) A family of high-performance matrix multiplication algorithms. In: Proceedings of the 7th International Conference on Applied Parallel Computing: State of the Art in Scientific Computing, ser. PARA’04. Berlin, Heidelberg, pp 256–265. https://doi.org/10.1007/11558958_30	es_ES
dc.description.references	Smith TM, van de Geijn RA (2019) The MOMMS family of matrix multiplication algorithms. CoRR, vol. abs/1904.05717. arXiv:1904.05717	es_ES
dc.description.references	Castelló A, Igual FD, Quintana-Ortí ES (2022) Anatomy of the BLIS family of algorithms for matrix multiplication. In: 30th Euromicro Workshop on Parallel, Distributed and Networked Processing PDP 2022, to appear	es_ES
upv.costeAPC	2345	es_ES
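
The abstract describes a gemm whose micro-kernel is built around a dot product, with explicit data movement between memory levels and integer arithmetic. The following is a minimal Python sketch of that general idea, not the authors' implementation: the function names (`dot4`, `gemm_dot`), the 4-element dot-product width, the block sizes `mc`, `nc`, `kc`, and the loop ordering are all assumptions chosen for illustration. The list copies into `Ap` and `Bp` merely stand in for the DMA transfers into scratchpad memory mentioned in the abstract.

```python
def dot4(a, b):
    """Stand-in for a hardware dot-product instruction (4-element width is assumed)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def gemm_dot(A, B, m, n, k, mc=4, nc=4, kc=8):
    """Blocked integer C = A * B; A is m x k and B is k x n, stored as lists of rows.

    Hypothetical sketch: k and kc must be multiples of 4 so that every
    packed slice can be consumed by dot4 without a remainder loop.
    """
    assert k % 4 == 0 and kc % 4 == 0
    C = [[0] * n for _ in range(m)]
    for jc in range(0, n, nc):              # loop over column blocks of B and C
        for pc in range(0, k, kc):          # loop over the shared (inner) dimension
            kb = min(kc, k - pc)
            # "Pack" a block of B column by column so each column is contiguous
            # (models copying the block into a fast local buffer).
            Bp = [[B[pc + p][j] for p in range(kb)]
                  for j in range(jc, min(jc + nc, n))]
            for ic in range(0, m, mc):      # loop over row blocks of A and C
                # "Pack" a block of A row by row.
                Ap = [A[i][pc:pc + kb] for i in range(ic, min(ic + mc, m))]
                # Micro-kernel: each entry of C accumulates a chain of dot products.
                for i, a_row in enumerate(Ap):
                    for j, b_col in enumerate(Bp):
                        acc = 0
                        for p in range(0, kb, 4):
                            acc += dot4(a_row[p:p + 4], b_col[p:p + 4])
                        C[ic + i][jc + j] += acc
    return C
```

Because Python integers do not overflow, this sketch ignores the accumulator-width concerns that a real fixed-width integer kernel would have to handle.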


RiuNet: Institutional Repository of the Universidad Politécnica de Valencia
