Catalán, Sandra; Herrero, José R.; Quintana Ortí, Enrique Salvador; Rodríguez-Sánchez, Rafael; van de Geijn, Robert (Institute of Electrical and Electronics Engineers, 2019-01-31)
[EN] We propose two novel techniques for overcoming load-imbalance encountered when implementing so-called look-ahead mechanisms in relevant dense matrix factorizations for the solution of linear systems. Both techniques ...
[EN] Near Threshold Voltage (NTV) computing has been recently proposed as a technique to save energy, at the cost of incurring higher error rates including, among others, Silent Data Corruption (SDC). In this paper, we ...
San Juan-Sebastian, Pablo; Rodríguez-Sánchez, Rafael; Igual, Francisco D.; Alonso-Jordá, Pedro; Quintana-Ortí, Enrique S. (Springer-Verlag, 2021-10)
[EN] We introduce a high performance, multi-threaded realization of the gemm kernel for the ARMv8.2 architecture that operates with 16-bit (half precision) ...
Catalán, Sandra; Herrero, José R.; Igual Peña, Francisco Daniel; Rodríguez-Sánchez, Rafael; Quintana Ortí, Enrique Salvador; Adeniyi-Jones, Chris (Elsevier, 2018-03)
[EN] Dense linear algebra libraries, such as BLAS and LAPACK, provide a relevant collection of numerical tools for many scientific and engineering applications. While there exist high performance implementations of the ...
[EN] We investigate a parallelization strategy for dense matrix factorization (DMF) algorithms, using OpenMP, that departs from the legacy (or conventional) solution, which simply extracts concurrency from a multi-threaded ...
[EN] We investigate how to leverage the heterogeneous resources of an Asymmetric Multicore Processor (AMP) in order to deliver high performance in the reduction to condensed forms for the solution of dense eigenvalue and ...