dc.contributor.advisor | Sahuquillo Borrás, Julio | es_ES |
dc.contributor.advisor | Barnes, Stuart | |
dc.contributor.author | Palacios Piqueres, David | es_ES |
dc.date.accessioned | 2011-12-21T08:23:47Z | |
dc.date.available | 2011-12-21T08:23:47Z | |
dc.date.created | 2011-09-30 | |
dc.date.issued | 2011-12-21 | |
dc.identifier.uri | http://hdl.handle.net/10251/14101 | |
dc.description.abstract | Sparse Matrix-Vector multiplication is a key operation in science and engineering, as is the Conjugate Gradient method. Both are therefore under active study with the aim of increasing their performance, mainly on GPU devices, following the GPGPU (General Purpose GPU computing) trend. This thesis presents a study of the speedup gained when performing sparse Matrix-Vector multiplication based on the ELLR-T storage format, and when running a Conjugate Gradient solver that makes use of this algorithm, on different computing environments including multiple GPU cards and multiple hosts. The code has been specifically designed to harness the computational architecture of the GPU through the Nvidia CUDA (Compute Unified Device Architecture) API, and the bottlenecks to its performance have been carefully analysed. The analysis shows that the performance of the sparse Matrix-Vector algorithm, and therefore of the Conjugate Gradient method, is limited by the memory bandwidth of the computing architecture on which it is executed. However, when executed on multiple GPUs and/or multiple nodes, performance is bounded by the vector transfers between cards and nodes and by the synchronization time. In fact, the multi-GPU version of the Conjugate Gradient solver delivers approximately the same performance as the sequential one. The sequential Conjugate Gradient solver implemented in this thesis achieves a speedup of up to 26 on a Tesla C1060 over an Intel Xeon E5462, and of up to 14 on a Tesla C2050 over an Intel Core i7 X980, for matrices that represent real problems obtained from the University of Florida Sparse Matrix Collection. | es_ES |
dc.format.extent | 102 | es_ES |
dc.language | English | es_ES |
dc.publisher | Universitat Politècnica de València | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject.other | Computer Engineering | es_ES |
dc.title | Utilising multiple GPU cards and multiple hosts | es_ES |
dc.type | Final degree project/thesis | es_ES |
dc.rights.accessRights | Closed | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica | es_ES |
dc.description.bibliographicCitation | Palacios Piqueres, D. (2011). Utilising multiple GPU cards and multiple hosts. http://hdl.handle.net/10251/14101. | es_ES |
dc.description.accrualMethod | Delegated deposit | es_ES |