[EN] We address the efficient realization of matrix multiplication (gemm), with application in the convolution operator for machine learning, for the RISC-V core present in the GreenWaves GAP8 processor. Our approach ...
Coll Carrillo, Hugo(Universitat Politècnica de València, 2014-06-09)
[ES] The evolution of concrete technology over recent decades has one of its foremost exponents in Very High Performance Concrete, HMAR (UHPC or UHPRFC in the international literature). This material combines ...
[EN] The design of racing drones brings quite a thrilling challenge from a flight dynamics point of view. This work aims to offer a single-based simulation platform combining its geometric design, trajectory control, and ...
Jiménez Macedo, Víctor Daniel(Universitat Politècnica de València, 2013-07-08)
[ES] A simulation model offers many advantages in the field of reciprocating internal combustion engine development. Its usefulness is twofold. On the one hand, it helps to understand the nature of the physical phenomena that occur ...
Ferri Llinares, Raúl(Universitat Politècnica de València, 2019-09-16)
[ES] This document constitutes the report of the bachelor's thesis "Design of a test bench for the wheel coating of a Hyperloop prototype" for the Bachelor's Degree in Engineering of the ...
Castaño del Olmo, Damià(Universitat Politècnica de València, 2019-11-06)
[ES] The project focuses on the design and optimization of the most important component of a bicycle, its frame, since all the elements that make up a bicycle are attached to it. Through the use of materials ...
Barrachina, Sergio; Dolz, Manuel F.; San Juan, Pablo; Quintana-Ortí, Enrique S.(Elsevier, 2022-09)
[EN] Convolutional Neural Networks (CNNs) play a crucial role in many image recognition and classification tasks, recommender systems, brain-computer interfaces, etc. As a consequence, there is a notable interest in ...
[EN] We take a step forward towards developing high-performance codes for the convolution operator, based on the Winograd algorithm, that are easy to customise for general-purpose processor architectures. In our approach, ...
[EN] Near Threshold Voltage (NTV) computing has been recently proposed as a technique to save energy, at the cost of incurring higher error rates including, among others, Silent Data Corruption (SDC). In this paper, we ...
Catalán, Sandra; Herrero, José R.; Igual, Francisco D.; Quintana-Ortí, Enrique S.; Rodríguez-Sánchez, Rafael(John Wiley & Sons, 2023-12-10)
[EN] We extend a two-level task partitioning previously applied to the inversion of dense matrices via Gauss-Jordan elimination to the more challenging QR factorization as well as the initial orthogonal reduction to band ...
San Juan-Sebastian, Pablo; Rodríguez-Sánchez, Rafael; Igual, Francisco D.; Alonso-Jordá, Pedro; Quintana-Ortí, Enrique S.(Springer-Verlag, 2021-10)
[EN] We introduce a high performance, multi-threaded realization of the gemm kernel for the ARMv8.2 architecture that operates with 16-bit (half precision) ...
[EN] Many numerical algorithms for science and engineering applications require the solution of sparse triangular linear systems (sptrsv) as their most costly stage. For this reason, considerable research has been dedicated ...
[EN] We provide a practical demonstration that it is possible to systematically generate a variety of high-performance micro-kernels for the general matrix multiplication (gemm) via generic templates which can be easily ...
[EN] We present a reliable and efficient FPGA implementation of a procedure for the computation of the noise estimation matrix, a key stage for subspace identification of hyperspectral images. Our hardware realization is ...
[EN] In this work, we assess the performance and energy efficiency of high-performance codes for the convolution operator, based on the direct, explicit/implicit lowering and Winograd algorithms used for deep learning (DL) ...
[EN] We present a novel method for the QR factorization of large tall-and-skinny matrices that introduces an approximation technique for computing the Householder vectors. This approach is very competitive on a hybrid ...
[EN] We present accurate piece-wise models for the time and energy costs of high performance implementations of both the matrix multiplication (gemm) and the triangular system solve with multiple right-hand sides (trsm) ...
[EN] The roofline model not only provides a powerful tool to relate an application's performance with the specific constraints imposed by the target hardware but also offers a graphic representation of the balance between ...