On the Benefits of the Remote GPU Virtualization Mechanism: the rCUDA Case

Silla Jiménez, Federico; Iserte Agut, Sergio; Reaño González, Carlos; Prades, Javier

doi:10.1002/cpe.4072



dc.contributor.author	Silla Jiménez, Federico	es_ES
dc.contributor.author	Iserte Agut, Sergio	es_ES
dc.contributor.author	Reaño González, Carlos	es_ES
dc.contributor.author	Prades, Javier	es_ES
dc.date.accessioned	2020-10-22T03:32:32Z
dc.date.available	2020-10-22T03:32:32Z
dc.date.issued	2017-07-10	es_ES
dc.identifier.issn	1532-0626	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/152813
dc.description.abstract	[EN] Graphics processing units (GPUs) are being adopted in many computing facilities because of their extraordinary computing power, which makes it possible to accelerate many general-purpose applications from different domains. However, GPUs also present several side effects, such as increased acquisition costs and larger space requirements. They also require higher-capacity power supplies. Furthermore, GPUs still consume some energy while idle, and their utilization is usually low for most workloads. In a similar way to virtual machines, the use of virtual GPUs may address these concerns. In this regard, the remote GPU virtualization mechanism allows an application running in one node of the cluster to transparently use the GPUs installed in other nodes. Moreover, this technique allows the GPUs present in the computing facility to be shared among the applications running in the cluster. In this way, several applications running in different (or the same) cluster nodes can share one or more GPUs located in other nodes of the cluster. Sharing GPUs should increase overall GPU utilization, thus reducing the negative impact of the side effects mentioned above. Reducing the total number of GPUs installed in the cluster may also be possible. In this paper, we explore some of the benefits that remote GPU virtualization brings to clusters. For instance, this mechanism allows an application to use all the GPUs present in the computing facility. Another benefit is that cluster throughput, measured as jobs completed per time unit, increases noticeably when this technique is used; for some workloads, it can be doubled. Furthermore, in addition to increasing overall GPU utilization, total energy consumption can be reduced by up to 40%. This may be key in the context of exascale computing facilities, which face an important energy constraint. Other benefits relate to the cloud computing domain, where a GPU can be easily shared among several virtual machines. Finally, GPU migration (and therefore server consolidation) is one more benefit of this novel technique.	es_ES
dc.description.sponsorship	Generalitat Valenciana, Grant/Award Number: PROMETEOII/2013/009; MINECO and FEDER, Grant/Award Number: TIN2014-53495-R	es_ES
dc.language	English	es_ES
dc.publisher	John Wiley & Sons	es_ES
dc.relation.ispartof	Concurrency and Computation: Practice and Experience	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	CUDA	es_ES
dc.subject	GPU migration	es_ES
dc.subject	GPU virtualization	es_ES
dc.subject	InfiniBand	es_ES
dc.subject	Slurm	es_ES
dc.subject	Xen	es_ES
dc.subject.classification	Computer Architecture and Technology	es_ES
dc.title	On the Benefits of the Remote GPU Virtualization Mechanism: the rCUDA Case	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.1002/cpe.4072	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MINECO//TIN2014-53495-R/ES/COMPUTACION HETEROGENEA DE BAJO CONSUMO/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/GVA//PROMETEOII%2F2013%2F009/ES/DESARROLLO DE LIBRERIAS PARA GESTIONAR EL ACCESO A DISPOSITIVOS REMOTOS COMPARTIDOS EN SERVIDORES DE ALTAS PRESTACIONES/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors	es_ES
dc.description.bibliographicCitation	Silla Jiménez, F.; Iserte Agut, S.; Reaño González, C.; Prades, J. (2017). On the Benefits of the Remote GPU Virtualization Mechanism: the rCUDA Case. Concurrency and Computation: Practice and Experience. 29(13):1-17. https://doi.org/10.1002/cpe.4072	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	https://doi.org/10.1002/cpe.4072	es_ES
dc.description.upvformatpinicio	1	es_ES
dc.description.upvformatpfin	17	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	29	es_ES
dc.description.issue	13	es_ES
dc.relation.pasarela	S\327017	es_ES
dc.contributor.funder	Generalitat Valenciana	es_ES
dc.contributor.funder	Ministerio de Economía y Competitividad	es_ES

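The abstract measures cluster throughput as jobs completed per time unit and attributes the gain to several applications sharing the same GPUs. The toy model below is only a sketch of that metric, not the authors' evaluation method: the job durations, the GPU count, the number of jobs per GPU, and the assumption that jobs sharing a GPU do not slow each other down are all hypothetical, and the function names `makespan` and `throughput` are illustrative.

```python
import heapq


def makespan(durations, slots):
    """Greedy list scheduling: each job starts on the slot that frees up first.

    A "slot" is one place where a job can run. Returns the time at which
    the last job finishes.
    """
    if slots < 1:
        raise ValueError("at least one slot is required")
    free_at = [0.0] * slots          # time at which each slot becomes free
    heapq.heapify(free_at)
    finish = 0.0
    for d in durations:
        start = heapq.heappop(free_at)   # earliest available slot
        end = start + d
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


def throughput(n_jobs, elapsed):
    """Cluster throughput as jobs completed per time unit."""
    return n_jobs / elapsed


# Hypothetical workload: 16 identical jobs of 10 time units each.
jobs = [10.0] * 16
gpus = 4

# Exclusive use: one job per GPU at a time.
exclusive = throughput(len(jobs), makespan(jobs, gpus))

# Shared use: assume (idealized) that 2 jobs fit on each GPU with no slowdown.
jobs_per_gpu = 2
shared = throughput(len(jobs), makespan(jobs, gpus * jobs_per_gpu))

print(f"exclusive: {exclusive:.2f} jobs per time unit")
print(f"shared:    {shared:.2f} jobs per time unit")
print(f"ratio:     {shared / exclusive:.2f}")
```

In this idealized setting the ratio is simply the number of jobs per GPU. In a real cluster the gain would depend on GPU memory, kernel contention, and network overhead, none of which this sketch models.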

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia
