Reproducibility of parallel preconditioned conjugate gradient in hybrid programming environments

Iakymchuk, Roman; Barreda Vayá, Maria; Graillat, Stef; Aliaga, José I.; Quintana Ortí, Enrique Salvador

doi:10.1177/1094342020932650


Iakymchuk, R.; Barreda Vayá, M.; Graillat, S.; Aliaga, JI.; Quintana Ortí, ES. (2020). Reproducibility of parallel preconditioned conjugate gradient in hybrid programming environments. International Journal of High Performance Computing Applications. 34(5):502-518. https://doi.org/10.1177/1094342020932650

Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/169416

Files in this item

Name: IakymchukBarredaG ...

Size: 558.8Kb

Format: PDF

Description: Author's version.

Name: 1094342020932650.pdf

Size: 1.050Mb

Format: PDF

Description: Publisher's version.

Item metadata

Title:

Reproducibility of parallel preconditioned conjugate gradient in hybrid programming environments

Authors:

Iakymchuk, Roman; Barreda Vayá, Maria; Graillat, Stef; Aliaga, José I.; Quintana Ortí, Enrique Salvador

UPV entity:

Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors

Release date:

2020-09

Abstract:

[EN] The Preconditioned Conjugate Gradient method is often employed for the solution of linear systems of equations arising in numerical simulations of physical phenomena. While being widely used, the solver is also known ...

Keywords:

Preconditioned conjugate gradient, MPI, OpenMP tasks, Reproducibility, Accuracy, Floating-point expansion, Long accumulator, Fused multiply-add

Usage rights:

All rights reserved

Source:

International Journal of High Performance Computing Applications (ISSN: 1094-3420)

DOI:

10.1177/1094342020932650

Publisher:

SAGE Publications

Publisher's version:

https://doi.org/10.1177/1094342020932650

Project code:

info:eu-repo/grantAgreement/EC/H2020/730897/EU/Transnational Access Programme for a Pan-European Network of HPC Research Infrastructures and Laboratories for scientific computing/
info:eu-repo/grantAgreement/UJI//POSDOC-A%2F2017%2F11/
info:eu-repo/grantAgreement/EC/H2020/842528/EU/Robust and Energy-Efficient Numerical Solvers Towards Reliable and Sustainable Scientific Computations/
info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-82972-R/ES/TECNICAS ALGORITMICAS PARA COMPUTACION DE ALTO RENDIMIENTO CONSCIENTE DEL CONSUMO ENERGETICO Y RESISTENTE A ERRORES/

Acknowledgements:

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was partially supported by the European Union's Horizon 2020 research, ...

Type:

Article



RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia
