A cluster computer performance predictor for memory scheduling

Serrano Gómez, Mónica; Sahuquillo Borrás, Julio; Hassan Mohamed, Houcine; Petit Martí, Salvador Vicente; Duato Marín, José Francisco

doi:10.1007/978-3-642-24669-2_34

Files in this item

Name: A Cluster Computer ...

Size: 144.1Kb

Format: PDF

Description: Author's version.

Name: Serrano;Sahuquill ...

Size: 152.0Kb

Format: PDF

Description: Publisher's version

dc.contributor.author	Serrano Gómez, Mónica	es_ES
dc.contributor.author	Sahuquillo Borrás, Julio	es_ES
dc.contributor.author	Hassan Mohamed, Houcine	es_ES
dc.contributor.author	Petit Martí, Salvador Vicente	es_ES
dc.contributor.author	Duato Marín, José Francisco	es_ES
dc.date.accessioned	2014-03-04T11:44:37Z
dc.date.issued	2011
dc.identifier.isbn	978-3-642-24668-5
dc.identifier.issn	0302-9743
dc.identifier.uri	http://hdl.handle.net/10251/36140
dc.description.abstract	Remote Memory Access (RMA) hardware allows a given motherboard in a cluster to directly access the memory installed in a remote motherboard of the same cluster. In recent work, this characteristic has been used to extend the addressable memory space of selected motherboards, which enables a better balance of main memory resources among cluster applications. This approach is much more cost-effective than implementing a full-fledged shared memory system. In this context, the memory scheduler is in charge of finding a suitable distribution of local and remote memory that maximizes performance and guarantees a minimum QoS among the applications. Since changing the memory distribution is a slow process involving several motherboards, the memory scheduler needs to make sure that the target distribution provides better performance than the current one. In this paper, a performance predictor is designed to find the best memory distribution for a given set of applications executing on a cluster motherboard. The predictor uses simple hardware counters to estimate the expected performance impact of the different memory distributions. The hardware counters provide the predictor with information about the time spent in the processor, in memory accesses, and in the network. The performance model used by the predictor has been validated in a detailed microarchitectural simulator using real benchmarks. Results show that the prediction never deviates by more than 5% from the real results, and the deviation is below 0.5% in most cases.	es_ES
dc.description.sponsorship	This work was supported by Spanish CICYT under Grant TIN2009-14475-C04-01, and by Consolider-Ingenio under Grant CSD2006-00046
dc.format.extent	10	es_ES
dc.language	English	es_ES
dc.publisher	Springer Verlag (Germany)	es_ES
dc.relation.ispartof	Algorithms and Architectures for Parallel Processing	es_ES
dc.relation.ispartofseries	Lecture Notes in Computer Science;vol. 7017
dc.rights	All rights reserved	es_ES
dc.subject	Cluster computers	es_ES
dc.subject	Memory scheduling	es_ES
dc.subject	Remote memory assignment	es_ES
dc.subject	Performance estimation	es_ES
dc.subject.classification	COMPUTER ARCHITECTURE AND TECHNOLOGY	es_ES
dc.title	A cluster computer performance predictor for memory scheduling	es_ES
dc.type	Book chapter	es_ES
dc.embargo.lift	10000-01-01
dc.embargo.terms	forever	es_ES
dc.identifier.doi	10.1007/978-3-642-24669-2_34
dc.relation.projectID	info:eu-repo/grantAgreement/MICINN//TIN2009-14475-C04-01/ES/Arquitecturas De Servidores, Aplicaciones Y Servicios/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MEC//CSD2006-00046/ES/Arquitecturas fiables y de altas prestaciones para centros de proceso de datos y servidores de Internet/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors	es_ES
dc.description.bibliographicCitation	Serrano Gómez, M.; Sahuquillo Borrás, J.; Hassan Mohamed, H.; Petit Martí, SV.; Duato Marín, JF. (2011). A cluster computer performance predictor for memory scheduling. In Algorithms and Architectures for Parallel Processing. Springer Verlag (Germany). 7017:353-362. https://doi.org/10.1007/978-3-642-24669-2_34	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.conferencename	11th International Conference, ICA3PP 2011	es_ES
dc.relation.conferencedate	October 24-26, 2011	es_ES
dc.relation.conferenceplace	Melbourne, Australia	es_ES
dc.relation.publisherversion	http://link.springer.com/chapter/10.1007/978-3-642-24669-2_34	es_ES
dc.description.upvformatpinicio	353	es_ES
dc.description.upvformatpfin	362	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	7017	es_ES
dc.relation.senia	221014
dc.contributor.funder	Ministerio de Ciencia e Innovación
dc.contributor.funder	Ministerio de Educación y Ciencia	es_ES

This item appears in the following collection(s)

Articles, conferences, monographs [48344]

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia