Iserte Agut, Sergio; Mayo Gual, Rafael; Quintana Ortí, Enrique Salvador; Beltrán, Vicenç; Peña Monferrer, Antonio José(Elsevier, 2018)
[EN] Adaptive workloads can change the configuration of their jobs on the fly, in terms of the number of processes. To carry out these job reconfigurations, we have designed a methodology which enables a job to communicate ...
Iserte, Sergio; Mayo, Rafael; Quintana-Ortí, Enrique S.; Peña Monferrer, Antonio José(Institute of Electrical and Electronics Engineers, 2021-09-01)
[EN] Process malleability has proved to have a highly positive impact on the resource utilization and global productivity in data centers compared with the conventional static resource allocation policy. However, the ...
Alonso-Jordá, Pedro; Dolz Zaragozá, Manuel Francisco; Igual, Francisco D.; Mayo, Rafael; Quintana Ortí, Enrique Salvador(Springer Verlag (Germany), 2012-11)
[EN] This paper analyzes the impact on power consumption of two DVFS-control strategies when applied to the execution of dense linear algebra operations on multi-core processors. The strategies considered here, prototyped ...
Barrachina, Sergio; Dolz, Manuel F.; San Juan, Pablo; Quintana-Ortí, Enrique S.(Elsevier, 2022-09)
[EN] Convolutional Neural Networks (CNNs) play a crucial role in many image recognition and classification tasks, recommender systems, brain-computer interfaces, etc. As a consequence, there is a notable interest in ...
[EN] The calculation of overlaps between many-electron wave functions at different nuclear geometries during nonadiabatic dynamics simulations requires the evaluation of a large number of determinants of matrices that ...
[EN] Near Threshold Voltage (NTV) computing has been recently proposed as a technique to save energy, at the cost of incurring higher error rates including, among others, Silent Data Corruption (SDC). In this paper, we ...
Alonso-Jordá, Pedro; Dolz Zaragozá, Manuel Francisco; Igual, Francisco D.; Mayo, Rafael; Quintana Ortí, Enrique Salvador(Wiley, 2014-10)
The road towards Exascale Computing requires a holistic effort to address three different challenges simultaneously: high performance, energy efficiency, and programmability. The use of runtime task schedulers to orchestrate ...
[EN] We address the parallelization of the LU factorization of hierarchical matrices (H-matrices) arising from boundary element methods. Our approach exploits task-parallelism via the OmpSs programming model and runtime, ...
Castelló-Gimeno, Adrián; Peña Monferrer, Antonio José; Mayo Gual, Rafael; Planas, Judit; Quintana Ortí, Enrique Salvador; Balaji, Pavan(Springer-Verlag, 2018-11)
[EN] Directive-based programming models, such as OpenMP, OpenACC, and OmpSs, enable users to accelerate applications by using coprocessors with little effort. These devices offer significant computing power, but their use ...
[EN] We investigate the factorized solution of generalized stable Sylvester equations such as those arising in model reduction, image restoration, and observer design. Our algorithms, based on the matrix sign function, ...
Alventosa, Fran J.; Alonso-Jordá, Pedro; Vidal Maciá, Antonio Manuel; Piñero, Gema; Quintana-Ortí, Enrique S.(Springer-Verlag, 2019-03)
[EN] The processing of digital sound signals often requires the computation of the QR factorization of a rectangular system matrix. However, sometimes, only a given (and probably small) part of the system matrix varies ...
[EN] We introduce a version of the epistasis test in FaST-LMM for clusters of multithreaded processors. This new software maintains the sensitivity of the original FaST-LMM while delivering acceleration that is close to ...
[EN] Resilience is considered a challenging, under-addressed issue that the high-performance computing (HPC) community will have to face in order to produce reliable Exascale systems by the beginning of the next decade. As ...
[EN] We present FloatX (Float eXtended), a C++ framework to investigate the effect of leveraging customized floating-point formats in numerical applications. FloatX formats are based on binary IEEE 754 with smaller significand ...
[EN] In this article, we present GINKGO, a modern C++ math library for scientific high performance computing. While classical linear algebra libraries act on matrix and vector objects, GINKGO's design principle abstracts ...
[EN] We propose a reproducible variant of the unblocked LU factorization for graphics processor units (GPUs). For this purpose, we build upon Level-1/2 BLAS kernels that deliver correctly-rounded and reproducible results ...
Castelló, Adrián; Barrachina, Sergio; Dolz Zaragozá, Manuel Francisco; Quintana-Ortí, Enrique S.; San Juan-Sebastian, Pablo; Tomás Domínguez, Andrés Enrique(Elsevier, 2022-04)
[EN] We evolve PyDTNN, a framework for distributed parallel training of Deep Neural Networks (DNNs), into an efficient inference tool for convolutional neural networks. Our optimization process on multicore ARM processors ...
As sequencing technologies progress, the amount of data produced grows exponentially, shifting the bottleneck of discovery towards the data analysis phase. In particular, currently available mapping solutions for RNA-seq ...
Reaño González, Carlos; Silla Jiménez, Federico; Castello Gimeno, Adrián; Peña Monferrer, Antonio José; Mayo Gual, Rafael; Quintana Ortí, Enrique Salvador; Duato Marín, José Francisco(Wiley, 2015-09-25)
Graphics processing units (GPUs) are being increasingly embraced by the high-performance computing community as an effective way to reduce execution time by accelerating parts of their applications. remote CUDA (rCUDA) ...
Reaño González, Carlos; Mayo Gual, Rafael; Quintana Ortí, Enrique Salvador; Silla Jiménez, Federico; Duato Marín, José Francisco; Peña Monferrer, Antonio José(IEEE, 2013-09-23)
The use of GPUs to accelerate general-purpose scientific and engineering applications is mainstream today, but their adoption in current high-performance computing clusters is impaired primarily by acquisition costs and ...