- -

A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

Compartir/Enviar a

Citas

Estadísticas

  • Estadisticas de Uso

A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor

Mostrar el registro completo del ítem

Ramírez-Betancourth, C.; Castelló, A.; Quintana-Ortí, ES. (2022). A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor. The Journal of Supercomputing. 78(16):18051-18060. https://doi.org/10.1007/s11227-022-04581-6

Por favor, use este identificador para citar o enlazar este ítem: http://hdl.handle.net/10251/195322

Ficheros en el ítem

Metadatos del ítem

Título: A BLIS-like matrix multiplication for machine learning in the RISC-V ISA-based GAP8 processor
Autor: Ramírez-Betancourth, Cristian Castelló, Adrián Quintana-Ortí, Enrique S.
Entidad UPV: Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica
Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors
Fecha difusión:
Resumen:
[EN] We address the efficient realization of matrix multiplication (gemm), with application in the convolution operator for machine learning, for the RISC-V core present in the GreenWaves GAP8 processor. Our approach ...[+]
Palabras clave: Matrix multiplication , High performance , RISC-V GAP8
Derechos de uso: Reconocimiento (by)
Fuente:
The Journal of Supercomputing. (issn: 0920-8542 )
DOI: 10.1007/s11227-022-04581-6
Editorial:
Springer-Verlag
Versión del editor: https://doi.org/10.1007/s11227-022-04581-6
Código del Proyecto:
info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-113656RB-C22/ES/COMPUTACION Y COMUNICACIONES DE ALTAS PRESTACIONES CONSCIENTES DEL CONSUMO ENERGETICO. APLICACIONES AL APRENDIZAJE PROFUNDO COMPUTACIONAL - UPV/
info:eu-repo/grantAgreement/AGENCIA ESTATAL DE INVESTIGACION//FJC2019-039222-I//AYUDA JUAN DE LA CIERVA FORMACION-CASTELLO GIMENO, ADRIAN/
info:eu-repo/grantAgreement/EC/H2020/955558/EU
info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//GRISOLIAP%2F2020%2F086//AYUDA SANTIAGO GRISOLIA: COMPUTACION DE ALTAS PRESTACIONES CONSCIENTE DEL CONSUMO PARA REDES NEURONALES PROFUNDAS./
Agradecimientos:
This work was supported by the research project PID2020-113656RB-C22 of MCIN/AEI/10.13039/501100011033. C. Ramirez is a "Santiago Grisolia" fellow supported by Generalitat Valenciana. Adrian Castello is a FJC2019-039222-I ...[+]
Tipo: Artículo

References

Hazelwood K et al (2018) Applied machine learning at Facebook: a datacenter infrastructure perspective. In: IEEE International Symposium on High Performance Computer Architecture, pp 620–629

Park J et al (2018) Deep learning inference in Facebook data centers: characterization, performance optimizations and hardware implications. arXiv:1811.09886

Wu C et al (2019) Machine learning at Facebook: understanding inference at the edge. In: International Symposium on High Performance Computer Architecture, pp 331–344 [+]
Hazelwood K et al (2018) Applied machine learning at Facebook: a datacenter infrastructure perspective. In: IEEE International Symposium on High Performance Computer Architecture, pp 620–629

Park J et al (2018) Deep learning inference in Facebook data centers: characterization, performance optimizations and hardware implications. arXiv:1811.09886

Wu C et al (2019) Machine learning at Facebook: understanding inference at the edge. In: International Symposium on High Performance Computer Architecture, pp 331–344

Yi S, Li C, Li Q (2015) A survey of fog computing: concepts, applications and issues. In: Proceedings of the 2015 Workshop on Mobile Big Data, ser. Mobidata’15, pp 37–42

Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems (NIPS), vol 1, pp 1097–1105

Pouyanfar S et al (2018) A survey on deep learning: algorithms, techniques, and applications. ACM Comput Surv 51(5):92:1-92:36

Sze V et al (2017) Efficient processing of deep neural networks: a tutorial and survey. Proc IEEE 105(12):2295–2329

Chellapilla K, Puri S, Simard P (2006) High performance convolutional neural networks for document processing. In: Tenth International Workshop on Frontiers in Handwriting Recognition

Georganas E et al (2018) Anatomy of high-performance deep learning convolutions on SIMD architectures. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, ser. SC ’18. IEEE Press

San Juan P et al (2020) High performance and portable convolution operators for multicore processors. In: Proceedings of the 32nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), pp 91–98

Van Zee FG, van de Geijn RA (2015) BLIS: a framework for rapidly instantiating BLAS functionality. ACM Trans Math Softw 41(3):14:1-14:33

Flamand E et al (2018) GAP-8: A RISC-V SoC for AI at the edge of the IoT. In: IEEE 29th Interantional Conference on Application-Specific Systems, Architectures and Processors, pp 1–4

Ali M et al (2012) Level-3 blas on the ti c6678 multi-core dsp. In: 2012 IEEE 24th International Sympsoium on Computer Architecture and High Performance Computing

Lavin A, Gray S (2016) Fast algorithms for convolutional neural networks. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 4013–4021

Zlateski A, Jia Z, Li K, Durand F (2019) The anatomy of efficient FFT and Winograd convolutions on modern CPUs. In: Proceedings of the ACM International Conference on Supercomputing, ser. ICS ’19, pp 414–424

Low TM et al (2016) Analytical modeling is enough for high-performance BLIS. ACM Trans Math Softw 43(2)

Howard AG, Zhu M, Chen B, Kalenichenko D, Wang W, Weyand T, Andreetto M, Adam H (2017) Mobilenets: efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861

Gunnels JA, Gustavson FG, Henry GM, van de Geijn RA (2004) A family of high-performance matrix multiplication algorithms. In: Proceedings of the 7th International Conference on Applied Parallel Computing: State of the Art in Scientific Computing, ser. PARA’04. Berlin, Heidelberg, pp 256–265. https://doi.org/10.1007/11558958_30

Smith TM, van de Geijn RA (2019) The MOMMS family of matrix multiplication algorithms. CoRR, vol. abs/1904.05717. arXiv:1904.05717

Castelló A, Igual FD, Quintana-Ortí ES (2022) Anatomy of the BLIS family of algorithms for matrix multiplication. In: 30th Euromicro Workshop on Parallel, Distributed and Networked Processing PDP 2022, to appear

[-]

recommendations

 

Este ítem aparece en la(s) siguiente(s) colección(ones)

Mostrar el registro completo del ítem