Predictive Reliability and Fault Management in Exascale Systems: State of the Art and Perspectives

RiuNet: Institutional repository of the Polytechnic University of Valencia

Canal, R.; Hernández Luz, C.; Tornero-Gavilá, R.; Cilardo, A.; Massari, G.; Reghenzani, F.; Fornaciari, W.... (2020). Predictive Reliability and Fault Management in Exascale Systems: State of the Art and Perspectives. ACM Computing Surveys. 53(5):1-32. https://doi.org/10.1145/3403956

Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/166962

Item Metadata

Title: Predictive Reliability and Fault Management in Exascale Systems: State of the Art and Perspectives
Author: Canal, Ramon; Hernández Luz, Carles; Tornero-Gavilá, Rafael; Cilardo, Alessandro; Massari, Giuseppe; Reghenzani, Federico; Fornaciari, William; Zapater, Marina; Atienza, David; Oleksiak, Ariel; Piatek, Wojciech; Abella, Jaume
UPV Unit: Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors
Issued date: 2020
Abstract: [EN] Performance and power constraints come together with Complementary Metal Oxide Semiconductor technology scaling in future Exascale systems. Technology scaling makes each individual transistor more prone to faults and, ...
Subjects: HPC, Supercomputing, Exascale, Reliability, Prediction, Survey, Faults, Failures
Copyright: All rights reserved
Source: ACM Computing Surveys (ISSN: 0360-0300)
DOI: 10.1145/3403956
Publisher: Association for Computing Machinery
Publisher version: https://doi.org/10.1145/3403956
Project ID:
info:eu-repo/grantAgreement/EC/H2020/801137/EU/REliable power and time-ConstraInts-aware Predictive management of heterogeneous Exascale systems/
info:eu-repo/grantAgreement/MINECO//RYC-2013-14717/ES/RYC-2013-14717/
info:eu-repo/grantAgreement/MINECO//TIN2015-65316-P/ES/COMPUTACION DE ALTAS PRESTACIONES VII/
GC/2017SGR0962
Description: © ACM, 2020. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Computing Surveys, Vol. 53, No. 5, Article 95. Publication date: September 2020. https://doi.org/10.1145/3403956
Thanks: This work has received funding from the European Union's Horizon 2020 (H2020) research and innovation program under the FET-HPC Grant Agreement No. 801137 (RECIPE). Jaume Abella was also partially supported by the Ministry ...
Type: Article
