Mostrar el registro sencillo del ítem
dc.contributor.author | Silla Jiménez, Federico | es_ES |
dc.contributor.author | Iserte Agut, Sergio | es_ES |
dc.contributor.author | Reaño González, Carlos | es_ES |
dc.contributor.author | Prades, Javier | es_ES |
dc.date.accessioned | 2020-10-22T03:32:32Z | |
dc.date.available | 2020-10-22T03:32:32Z | |
dc.date.issued | 2017-07-10 | es_ES |
dc.identifier.issn | 1532-0626 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/152813 | |
dc.description.abstract | [EN] Graphics processing units (GPUs) are being adopted in many computing facilities given their extraordinary computing power, which makes it possible to accelerate many general purpose applications from different domains. However, GPUs also present several side effects, such as increased acquisition costs as well as larger space requirements. They also require more powerful energy supplies. Furthermore, GPUs still consume some amount of energy while idle, and their utilization is usually low for most workloads. In a similar way to virtual machines, the use of virtual GPUs may address the aforementioned concerns. In this regard, the remote GPU virtualization mechanism allows an application being executed in a node of the cluster to transparently use the GPUs installed at other nodes. Moreover, this technique allows to share the GPUs present in the computing facility among the applications being executed in the cluster. In this way, several applications being executed in different (or the same) cluster nodes can share 1 or more GPUs located in other nodes of the cluster. Sharing GPUs should increase overall GPU utilization, thus reducing the negative impact of the side effects mentioned before. Reducing the total amount of GPUs installed in the cluster may also be possible. In this paper, we explore some of the benefits that remote GPU virtualization brings to clusters. For instance, this mechanism allows an application to use all the GPUs present in the computing facility. Another benefit of this technique is that cluster throughput, measured as jobs completed per time unit, is noticeably increased when this technique is used. In this regard, cluster throughput can be doubled for some workloads. Furthermore, in addition to increase overall GPU utilization, total energy consumption can be reduced up to 40%. This may be key in the context of exascale computing facilities, which present an important energy constraint. Other benefits are related to the cloud computing domain, where a GPU can be easily shared among several virtual machines. Finally, GPU migration (and therefore server consolidation) is one more benefit of this novel technique. | es_ES |
dc.description.sponsorship | Generalitat Valenciana, Grant/Award Number: PROMETEOII/2013/009; MINECO and FEDER, Grant/Award Number: TIN2014-53495-R | es_ES |
dc.language | Inglés | es_ES |
dc.publisher | John Wiley & Sons | es_ES |
dc.relation.ispartof | Concurrency and Computation Practice and Experience | es_ES |
dc.rights | Reserva de todos los derechos | es_ES |
dc.subject | CUDA | es_ES |
dc.subject | GPU migration | es_ES |
dc.subject | GPU virtualization | es_ES |
dc.subject | InfiniBand | es_ES |
dc.subject | Slurm | es_ES |
dc.subject | Xen | es_ES |
dc.subject.classification | ARQUITECTURA Y TECNOLOGIA DE COMPUTADORES | es_ES |
dc.title | On the Benefits of the Remote GPU Virtualization Mechanism: the rCUDA Case | es_ES |
dc.type | Artículo | es_ES |
dc.identifier.doi | 10.1002/cpe.4072 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//TIN2014-53495-R/ES/COMPUTACION HETEROGENEA DE BAJO CONSUMO/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GVA//PROMETEOII%2F2013%2F009/ES/DESARROLLO DE LIBRERIAS PARA GESTIONAR EL ACCESO A DISPOSITIVOS REMOTOS COMPARTIDOS EN SERVIDORES DE ALTAS PRESTACIONES/ | es_ES |
dc.rights.accessRights | Abierto | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors | es_ES |
dc.description.bibliographicCitation | Silla Jiménez, F.; Iserte Agut, S.; Reaño González, C.; Prades, J. (2017). On the Benefits of the Remote GPU Virtualization Mechanism: the rCUDA Case. Concurrency and Computation Practice and Experience. 29(13):1-17. https://doi.org/10.1002/cpe.4072 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1002/cpe.4072 | es_ES |
dc.description.upvformatpinicio | 1 | es_ES |
dc.description.upvformatpfin | 17 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 29 | es_ES |
dc.description.issue | 13 | es_ES |
dc.relation.pasarela | S\327017 | es_ES |
dc.contributor.funder | Generalitat Valenciana | es_ES |
dc.contributor.funder | Ministerio de Economía y Competitividad | es_ES |
dc.description.references | Wu H Diamos G Sheard T Red Fox: An execution environment for relational query processing on GPUs Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization CGO '14 Orlando, FL, USA ACM 2014 44:44 44:54 | es_ES |
dc.description.references | Playne DP Hawick KA Data parallel three-dimensional cahn-hilliard field equation simulation on GPUs with CUDA Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, PDPTA Las Vegas, Nevada, USA 2009 | es_ES |
dc.description.references | Yamazaki, I., Dong, T., Solcà, R., Tomov, S., Dongarra, J., & Schulthess, T. (2013). Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems. Concurrency and Computation: Practice and Experience, 26(16), 2652-2666. doi:10.1002/cpe.3152 | es_ES |
dc.description.references | Yuancheng Luo D Canny edge detection on NVIDIA CUDA IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2008. CVPRW '08 Anchorage, AK, USA IEEE 2008 1 8 | es_ES |
dc.description.references | Surkov, V. (2010). Parallel option pricing with Fourier space time-stepping method on graphics processing units. Parallel Computing, 36(7), 372-380. doi:10.1016/j.parco.2010.02.006 | es_ES |
dc.description.references | Agarwal, P. K., Hampton, S., Poznanovic, J., Ramanthan, A., Alam, S. R., & Crozier, P. S. (2012). Performance modeling of microsecond scale biological molecular dynamics simulations on heterogeneous architectures. Concurrency and Computation: Practice and Experience, 25(10), 1356-1375. doi:10.1002/cpe.2943 | es_ES |
dc.description.references | Yoo, A. B., Jette, M. A., & Grondona, M. (2003). SLURM: Simple Linux Utility for Resource Management. Lecture Notes in Computer Science, 44-60. doi:10.1007/10968987_3 | es_ES |
dc.description.references | Silla F Prades J Iserte S Reaño C Remote GPU virtualization: Is it useful The 2nd IEEE International Workshop on High-Performance Interconnection Networks in the Exascale and Big-Data Era Barcelona, Spain IEEE Computer Society 2016 41 48 | es_ES |
dc.description.references | Liang TY Chang YW GridCuda: A grid-enabled CUDA programming toolkit 2011 IEEE Workshops of International Conference on Advanced Information Networking and Applications (WAINA) Biopolis, Singapore IEEE 2011 141 146 | es_ES |
dc.description.references | Oikawa M Kawai A Nomura K Yasuoka K Yoshikawa K Narumi T DS-CUDA: A middleware to use many GPUs in the cloud environment Proceedings of the 2012 SC Companion: High Performance Computing, Networking Storage and Analysis SCC '12 IEEE Computer Society Washington, DC, USA 2012 1207 1214 | es_ES |
dc.description.references | Giunta G Montella R Agrillo G Coviello G A GPGPU transparent virtualization component for high performance computing clouds Euro-Par 2010 - Parallel Processing Ischia, Italy Springer 2010 | es_ES |
dc.description.references | Shi L Chen H Sun J vCUDA: GPU accelerated high performance computing in virtual machines IEEE International Symposium on Parallel & Distributed Processing, 2009. IPDPS 2009 Rome, Italy IEEE 2009 1 11 | es_ES |
dc.description.references | Gupta V Gavrilovska A Schwan K GViM: GPU-accelerated virtual machines Proceedings of the 3rd ACM Workshop on System-level Virtualization for High Performance Computing Nuremberg, Germany 2009 17 24 | es_ES |
dc.description.references | Peña, A. J., Reaño, C., Silla, F., Mayo, R., Quintana-Ortí, E. S., & Duato, J. (2014). A complete and efficient CUDA-sharing solution for HPC clusters. Parallel Computing, 40(10), 574-588. doi:10.1016/j.parco.2014.09.011 | es_ES |
dc.description.references | CUDA API Reference Manual 7.5 https://developer.nvidia.com/cuda-toolkit 2016 | es_ES |
dc.description.references | Merritt AM Gupta V Verma A Gavrilovska A Schwan K Shadowfax: Scaling in heterogeneous cluster systems via GPGPU assemblies Proceedings of the 5th International Workshop on Virtualization Technologies in Distributed Computing VTDC '11 ACM New York, NY, USA 2011 3 10 | es_ES |
dc.description.references | Shadowfax II - scalable implementation of GPGPU assemblies http://keeneland.gatech.edu/software/keeneland/kidron | es_ES |
dc.description.references | NVIDIA The NVIDIA GPU Computing SDK Version 5.5 2013 | es_ES |
dc.description.references | iperf3: A TCP, UDP, and SCTP network bandwidth measurement tool https://github.com/esnet/iperf 2016 | es_ES |
dc.description.references | Reaño C Silla F Shainer G Schultz S Local and remote GPUs perform similar with EDR 100G InfiniBand Proceedings of the Industrial Track of the 16th International Middleware Conference Middleware Industry '15 Vancouver, Canada 2015 | es_ES |
dc.description.references | Reaño, C., Silla, F., Castelló, A., Peña, A. J., Mayo, R., Quintana-Ortí, E. S., & Duato, J. (2014). Improving the user experience of the rCUDA remote GPU virtualization framework. Concurrency and Computation: Practice and Experience, 27(14), 3746-3770. doi:10.1002/cpe.3409 | es_ES |
dc.description.references | Iserte S Castelló A Mayo R Slurm support for remote GPU virtualization: Implementation and performance study 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) 2014 318 325 | es_ES |
dc.description.references | Vouzis, P. D., & Sahinidis, N. V. (2010). GPU-BLAST: using graphics processors to accelerate protein sequence alignment. Bioinformatics, 27(2), 182-188. doi:10.1093/bioinformatics/btq644 | es_ES |
dc.description.references | Brown, W. M., Kohlmeyer, A., Plimpton, S. J., & Tharrington, A. N. (2012). Implementing molecular dynamics on hybrid high performance computers – Particle–particle particle-mesh. Computer Physics Communications, 183(3), 449-459. doi:10.1016/j.cpc.2011.10.012 | es_ES |
dc.description.references | Liu, Y., Schmidt, B., Liu, W., & Maskell, D. L. (2010). CUDA–MEME: Accelerating motif discovery in biological sequences using CUDA-enabled graphics processing units. Pattern Recognition Letters, 31(14), 2170-2177. doi:10.1016/j.patrec.2009.10.009 | es_ES |
dc.description.references | Pronk, S., Páll, S., Schulz, R., Larsson, P., Bjelkmar, P., Apostolov, R., … Lindahl, E. (2013). GROMACS 4.5: a high-throughput and highly parallel open source molecular simulation toolkit. Bioinformatics, 29(7), 845-854. doi:10.1093/bioinformatics/btt055 | es_ES |
dc.description.references | Klus, P., Lam, S., Lyberg, D., Cheung, M., Pullan, G., McFarlane, I., … Lam, B. Y. (2012). BarraCUDA - a fast short read sequence aligner using graphics processing units. BMC Research Notes, 5(1), 27. doi:10.1186/1756-0500-5-27 | es_ES |
dc.description.references | Kurtz, S., Phillippy, A., Delcher, A. L., Smoot, M., Shumway, M., Antonescu, C., & Salzberg, S. L. (2004). Genome Biology, 5(2), R12. doi:10.1186/gb-2004-5-2-r12 | es_ES |
dc.description.references | Chang, C.-C., & Lin, C.-J. (2011). LIBSVM. ACM Transactions on Intelligent Systems and Technology, 2(3), 1-27. doi:10.1145/1961189.1961199 | es_ES |
dc.description.references | Phillips, J. C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., … Schulten, K. (2005). Scalable molecular dynamics with NAMD. Journal of Computational Chemistry, 26(16), 1781-1802. doi:10.1002/jcc.20289 | es_ES |
dc.description.references | NVIDIA Popular GPU-Accelerated Applications Catalog http://www.nvidia.es/content/tesla/pdf/gpu-accelerated-applications-for-hpc.pdf 2016 | es_ES |
dc.description.references | Walters JP Younge AJ Kang D-I GPU-passthrough performance: A comparison of KVM, Xen, VMWare ESXi, and LXC for CUDA and OpenCL applications 7th IEEE International Conference on Cloud Computing (CLOUD 2014) Anchorage, AK, USA 2014 | es_ES |
dc.description.references | Yang C-T Wang H-Y Ou W-S Liu Y-T Hsu C-H On implementation of GPU virtualization using PCI pass-through 2012 IEEE 4th International Conference on Cloud Computing Technology and Science (CLOUDCOM) Taipei, Taiwan 2012 711 716 | es_ES |
dc.description.references | Pérez F Reaño C Silla F Providing CUDA acceleration to KVM virtual machines in InfiniBand clusters with rCUDA Proceedings of the International Conference on Distributed Applications and Interoperable Systems Crete, Greece 2016 | es_ES |
dc.description.references | Jo, H., Jeong, J., Lee, M., & Choi, D. H. (2013). Exploiting GPUs in Virtual Machine for BioCloud. BioMed Research International, 2013, 1-11. doi:10.1155/2013/939460 | es_ES |
dc.description.references | Prades J Reaño C Silla F CUDA acceleration for Xen virtual machines in Infiniband clusters with rCUDA Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming PPoPP '16 Barcelona, Spain 2016 | es_ES |
dc.description.references | Mellanox Mellanox OFED for Linux User Manual 2015 | es_ES |
dc.description.references | Liu, Y., Wirawan, A., & Schmidt, B. (2013). CUDASW++ 3.0: accelerating Smith-Waterman protein database search by coupling CPU and GPU SIMD instructions. BMC Bioinformatics, 14(1). doi:10.1186/1471-2105-14-117 | es_ES |
dc.description.references | Takizawa H Sato K Komatsu K Kobayashi H CheCUDA: A checkpoint/restart tool for CUDA applications Proceedings of the 2009 International Conference on Parallel and Distributed Computing, Applications and Technologies Hiroshima, Japan 2009 | es_ES |