Strohmaier E, Dongarra J, Simon H, Meuer M (2015) TOP500 supercomputing sites. http://www.top500.org/lists/2015/11 . Accessed Nov 2015
NVIDIA (2015) CUDA API reference, version 7.5
Shreiner D, Sellers G, Kessenich JM, Licea-Kane BM (2013) OpenGL programming guide: the official guide to learning OpenGL. Addison-Wesley Professional, Boston
[+]
Strohmaier E, Dongarra J, Simon H, Meuer M (2015) TOP500 supercomputing sites. http://www.top500.org/lists/2015/11 . Accessed Nov 2015
NVIDIA (2015) CUDA API reference, version 7.5
Shreiner D, Sellers G, Kessenich JM, Licea-Kane BM (2013) OpenGL programming guide: the official guide to learning OpenGL. Addison-Wesley Professional, Boston
Mark WR, Glanville RS, Akeley K, Kilgard MJ (2003) Cg: a system for programming graphics hardware in a C-like language. ACM Trans Graph (TOG) 22(3):896–907
Munshi A (2014)The OpenCL specification 2.0. 0.5em minus 0.4em Khronos OpenCL working group
OpenACC directives for accelerators (2015). http://www.openacc-standard.org . Accessed Dec 2015
OmpSs project home page. http://pm.bsc.es/ompss . Accessed Dec 2015
OpenMP application program interface 4.0 (2013). OpenMP Architecture Board
Peña AJ (2013) Virtualization of accelerators in high performance clusters. Ph.D. dissertation, Universitat Jaume I, Castellón
Kawai A, Yasuoka K, Yoshikawa K, Narumi T (2012) Distributed-shared CUDA: virtualization of large-scale GPU systems for programmability and reliability. In: International conference on future computational technologies and applications
Shi L, Chen H, Sun J, Li K (2012) vCUDA: GPU-accelerated high-performance computing in virtual machines. IEEE Trans Comput 61(6):804–816
Xiao S, Balaji P, Zhu Q, Thakur R, Coghlan S, Lin H, Wen G, Hong J, Feng W (2012) VOCL: an optimized environment for transparent virtualization of graphics processing units. In: Innovative parallel computing. IEEE, New York
Kim J, Seo S, Lee J, Nah J, Jo G, Lee J (2012) SnuCL: an OpenCL framework for heterogeneous CPU/GPU clusters. In: International conference on supercomputing
Duran A, Ayguadé E, Badia RM, Labarta J, Martinell L, Martorell X, Planas J (2011) OmpSs: a proposal for programming heterogeneous multi-core architectures. Parallel Process Lett 21(02):173–193
Castelló A, Duato J, Mayo R, Peña AJ, Quintana-Ortí ES, Roca V, Silla F (2014) On the use of remote GPUs and low-power processors for the acceleration of scientific applications. In: The fourth international conference on smart grids, green communications and IT energy-aware technologies, pp 57–62
Iserte S, Castelló A, Mayo R, Quintana-Ortí ES, Reaño C, Prades J, Silla F, Duato J (2014) SLURM support for remote GPU virtualization: implementation and performance study. In: International symposium on computer architecture and high performance computing (SBAC-PAD)
Peña AJ, Reaño C, Silla F, Mayo R, Quintana-Ortí ES, Duato J (2014) A complete and efficient CUDA-sharing solution for HPC clusters. Parallel Comput 40(10):574–588
Kegel P, Steuwer M, Gorlatch S (2012) dOpenCL: towards a uniform programming approach for distributed heterogeneous multi-/many-core systems. In: International parallel and distributed processing symposium workshops (IPDPSW)
Castelló A, Peña AJ, Mayo R, Balaji P, Quintana-Ortí ES (2015) Exploring the suitability of remote GPGPU virtualization for the OpenACC programming model using rCUDA. In: IEEE international conference on cluster computing
Castelló A, Mayo R, Planas J, Quintana-Ortí ES (2015) Exploiting task-parallelism on GPU clusters via OmpSs and rCUDA virtualization. In: IEEE international workshop on reengineering for parallelism in heterogeneous parallel platforms
HP Corp., Intel Corp., Microsoft Corp., Phoenix Tech. Ltd., Toshiba Corp. (2011) Advanced configuration and power interface specification, revision 5.0
Reaño C, Silla F, Castelló A, Peña AJ, Mayo R, Quintana-Ortí ES, Duato J (2014) Improving the user experience of the rCUDA remote GPU virtualization framework. Concurr Comput 27(14):3746–3770
PGI compilers and tools (2015) http://www.pgroup.com/ . Accessed Dec 2015
Johnson N (2013) EPCC OpenACC benchmark suite. https://www.epcc.ed.ac.uk/ . Accessed Dec 2015
Herdman J, Gaudin W, McIntosh-Smith S, Boulton M, Beckingsale D, Mallinson A, Jarvis SA (2012) Accelerating hydrocodes with OpenACC, OpenCL and CUDA. In: SC companion: high performance computing, networking, storage and analysis
[-]