Reproducibility strategies for parallel preconditioned Conjugate Gradient

Iakymchuk, Roman; Barreda, María; Wiesenberger, Matthias; Aliaga, José I.; Quintana Ortí, Enrique Salvador

doi:10.1016/j.cam.2019.112697

Identificarse

Buscar en RiuNet

Listar

Todo RiuNet
Esta colección

Mi cuenta

Acceder

Estadísticas

Ver Estadísticas de uso

Ayuda RiuNet

Admin. UPV

Compartir/Enviar a

Citas

Estadísticas

Reproducibility strategies for parallel preconditioned Conjugate Gradient

Mostrar el registro completo del ítem

Iakymchuk, R.; Barreda, M.; Wiesenberger, M.; Aliaga, JI.; Quintana Ortí, ES. (2020). Reproducibility strategies for parallel preconditioned Conjugate Gradient. Journal of Computational and Applied Mathematics. 371:1-13. https://doi.org/10.1016/j.cam.2019.112697

Por favor, use este identificador para citar o enlazar este ítem: http://hdl.handle.net/10251/170081

Ficheros en el ítem

Nombre: IakymchukBarredaW ...

Tamaño: 308.5Kb

Formato: PDF

Descripción: Versión del Autor.

Abrir/Preview

Nombre: 1-s2.0-S037704271 ...

Tamaño: 491.5Kb

Formato: PDF

Descripción: Versión editorial

Solicitar una copia al autor

Metadatos del ítem

Título:

Reproducibility strategies for parallel preconditioned Conjugate Gradient

Autor:

Iakymchuk, Roman Barreda, María Wiesenberger, Matthias Aliaga, José I.

Quintana Ortí, Enrique Salvador

Entidad UPV:

Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors

Fecha difusión:

2020-06

Resumen:

[EN] The Preconditioned Conjugate Gradient method is often used in numerical simulations. While being widely used, the solver is also known for its lack of accuracy while computing the residual. In this article, we aim at ...[+]

Palabras clave:

Reproducibility , Accuracy , Floating-point expansion , Long accumulator , Preconditioned Conjugate Gradient , High-Performance Computing

Derechos de uso:

Reconocimiento - No comercial - Sin obra derivada (by-nc-nd)

Fuente:

Journal of Computational and Applied Mathematics. (issn: 0377-0427 )

DOI:

10.1016/j.cam.2019.112697

Editorial:

Elsevier

Versión del editor:

https://doi.org/10.1016/j.cam.2019.112697

Código del Proyecto:

info:eu-repo/grantAgreement/EC/H2020/730897/EU/Transnational Access Programme for a Pan-European Network of HPC Research Infrastructures and Laboratories for scientific computing/
info:eu-repo/grantAgreement/UJI//POSDOC-A%2F2017%2F11/
info:eu-repo/grantAgreement/EC/H2020/842528/EU/Robust and Energy-Efficient Numerical Solvers Towards Reliable and Sustainable Scientific Computations/
info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2013-2016/TIN2017-82972-R/ES/TECNICAS ALGORITMICAS PARA COMPUTACION DE ALTO RENDIMIENTO CONSCIENTE DEL CONSUMO ENERGETICO Y RESISTENTE A ERRORES/

Agradecimientos:

To begin with, we would like to thank the reviewers for their thorough reading of the article as well as their valuable comments and suggestions. This research was partially supported by the European Union's Horizon 2020 ...[+]

Tipo:

Artículo

References

Lawson, C. L., Hanson, R. J., Kincaid, D. R., & Krogh, F. T. (1979). Basic Linear Algebra Subprograms for Fortran Usage. ACM Transactions on Mathematical Software, 5(3), 308-323. doi:10.1145/355841.355847

Dongarra, J. J., Du Croz, J., Hammarling, S., & Duff, I. S. (1990). A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, 16(1), 1-17. doi:10.1145/77626.79170

Demmel, J., & Nguyen, H. D. (2015). Parallel Reproducible Summation. IEEE Transactions on Computers, 64(7), 2060-2070. doi:10.1109/tc.2014.2345391

Iakymchuk, R., Graillat, S., Defour, D., & Quintana-Ortí, E. S. (2019). Hierarchical approach for deriving a reproducible unblocked LU factorization. The International Journal of High Performance Computing Applications, 33(5), 791-803. doi:10.1177/1094342019832968

Iakymchuk, R., Defour, D., Collange, S., & Graillat, S. (2016). Reproducible and Accurate Matrix Multiplication. Lecture Notes in Computer Science, 126-137. doi:10.1007/978-3-319-31769-4_11

Rump, S. M., Ogita, T., & Oishi, S. (2009). Accurate Floating-Point Summation Part II: Sign, K-Fold Faithful and Rounding to Nearest. SIAM Journal on Scientific Computing, 31(2), 1269-1302. doi:10.1137/07068816x

Burgess, N., Goodyer, C., Hinds, C. N., & Lutz, D. R. (2019). High-Precision Anchored Accumulators for Reproducible Floating-Point Summation. IEEE Transactions on Computers, 68(7), 967-978. doi:10.1109/tc.2018.2855729

D. Mukunoki, T. Ogita, K. Ozaki, Accurate and reproducible BLAS routines with Ozaki scheme for many-core architectures, in: Proc. International Conference on Parallel Processing and Applied Mathematics, PPAM2019, 2019, accepted.

Ogita, T., Rump, S. M., & Oishi, S. (2005). Accurate Sum and Dot Product. SIAM Journal on Scientific Computing, 26(6), 1955-1988. doi:10.1137/030601818

Kulisch, U., & Snyder, V. (2010). The exact dot product as basic tool for long interval arithmetic. Computing, 91(3), 307-313. doi:10.1007/s00607-010-0127-7

Boldo, S., & Melquiond, G. (2008). Emulation of a FMA and Correctly Rounded Sums: Proved Algorithms Using Rounding to Odd. IEEE Transactions on Computers, 57(4), 462-471. doi:10.1109/tc.2007.70819

Wiesenberger, M., Einkemmer, L., Held, M., Gutierrez-Milla, A., Sáez, X., & Iakymchuk, R. (2019). Reproducibility, accuracy and performance of the Feltor code and library on parallel computer architectures. Computer Physics Communications, 238, 145-156. doi:10.1016/j.cpc.2018.12.006

Fousse, L., Hanrot, G., Lefèvre, V., Pélissier, P., & Zimmermann, P. (2007). MPFR. ACM Transactions on Mathematical Software, 33(2), 13. doi:10.1145/1236463.1236468

J. Demmel, H.D. Nguyen, Fast reproducible floating-point summation, in: Proceedings of ARITH-21, 2013, pp. 163–172.

Ozaki, K., Ogita, T., Oishi, S., & Rump, S. M. (2011). Error-free transformations of matrix multiplication by using fast routines of matrix multiplication and its applications. Numerical Algorithms, 59(1), 95-118. doi:10.1007/s11075-011-9478-1

Carson, E., & Higham, N. J. (2018). Accelerating the Solution of Linear Systems by Iterative Refinement in Three Precisions. SIAM Journal on Scientific Computing, 40(2), A817-A847. doi:10.1137/17m1140819

[-]

recommendations

Este ítem aparece en la(s) siguiente(s) colección(ones)

Mostrar el registro completo del ítem

Reproducibility strategies for parallel preconditioned Conjugate Gradient

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

Buscar en RiuNet

Listar

Todo RiuNet

Esta colección

Mi cuenta

Estadísticas

Ayuda RiuNet

Admin. UPV

Compartir/Enviar a

Citas

Estadísticas

Reproducibility strategies for parallel preconditioned Conjugate Gradient

Ficheros en el ítem

Metadatos del ítem

References

recommendations

Este ítem aparece en la(s) siguiente(s) colección(ones)