Badía Contelles, José Manuel; Belloch Rodríguez, José Antonio; Cobos Serrano, Máximo; Igual Peña, Francisco Daniel; Quintana-Ortí, Enrique S.(Springer-Verlag, 2019-03)
[EN] The Steered Response Power with Phase Transform (SRP-PHAT) algorithm is a well-known method for sound source localization due to its robust performance in noisy and reverberant environments. This algorithm is used in ...
San Juan-Sebastian, Pablo; Rodríguez-Sánchez, Rafael; Igual, Francisco D.; Alonso-Jordá, Pedro; Quintana-Ortí, Enrique S.(Springer-Verlag, 2021-10)
[EN] We introduce a high performance, multi-threaded realization of the gemm kernel for the ARMv8.2 architecture that operates with 16-bit (half precision) ...
[EN] We provide a practical demonstration that it is possible to systematically generate a variety of high-performance micro-kernels for the general matrix multiplication (gemm) via generic templates which can be easily ...
Catalán, Sandra; Herrero, José R.; Igual Peña, Francisco Daniel; Rodríguez-Sánchez, Rafael; Quintana Ortí, Enrique Salvador; Adeniyi-Jones, Chris(Elsevier, 2018-03)
[EN] Dense linear algebra libraries, such as BLAS and LAPACK, provide a relevant collection of numerical tools for many scientific and engineering applications. While there exist high performance implementations of the ...
Belloch Rodríguez, José Antonio; Badia Contelles, J. M.; Igual Peña, Francisco Daniel; Gonzalez, Alberto; Quintana Ortí, Enrique Salvador(Institute of Electrical and Electronics Engineers, 2017-11)
[EN] Numerous signal processing applications are emerging on both mobile and high-performance computing systems. These applications are subject to responsiveness constraints for user interactivity and, at the same time, ...
Catalán, Sandra; Igual, Francisco D.; Herrero, José R.; Rodríguez-Sánchez, Rafael; Quintana-Ortí, Enrique S.(Elsevier, 2023-05)
[EN] We propose a methodology to address the programmability issues derived from the emergence of new-generation shared-memory NUMA architectures. For this purpose, we employ dense matrix factorizations and matrix inversion ...
[EN] We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts concurrency from a multi-threaded ...