Feliu Pérez, J.; Sahuquillo Borrás, J.; Petit Martí, SV.; Duato Marín, JF. (2012). Understanding cache hierarchy contention in CMPs to improve job scheduling. IEEE. https://doi.org/10.1109/IPDPS.2012.54
Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/67537
Title:
|
Understanding cache hierarchy contention in CMPs to improve job scheduling
|
Authors:
|
Feliu Pérez, Josué
Sahuquillo Borrás, Julio
Petit Martí, Salvador Vicente
Duato Marín, José Francisco
|
UPV unit:
|
Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors
|
Release date:
|
|
Abstract:
|
In order to improve CMP performance, recent research has focused on scheduling to mitigate the contention produced by limited memory bandwidth. Nowadays, commercial CMPs implement multi-level cache hierarchies where last-level caches are shared by at least two cache structures located at the immediately lower cache level. In turn, these caches can be shared by several multithreaded cores. In this microprocessor design, contention points may appear along the whole memory hierarchy. Moreover, this problem is expected to worsen in future technologies, since the number of cores and hardware threads, and consequently the size of the shared caches, increase with each microprocessor generation. In this paper we characterize the performance impact of the different contention points that appear along the memory subsystem. Then, we propose a generic scheduling strategy for CMPs that takes into account the available bandwidth at each level of the cache hierarchy. The proposed strategy selects the processes to be co-scheduled and allocates them to cores in order to minimize contention effects. The proposal has been implemented and evaluated on a commercial single-threaded quad-core processor with a relatively small two-level cache hierarchy. Although the potential for contention on this platform is lower than in recent processor designs, the proposal achieves performance improvements of up to 9% over the Linux scheduler, whereas a memory-aware scheduler that does not take the cache hierarchy into account always stays below 6% across the studied benchmark mixes. Moreover, in some cases the proposal doubles the speedup achieved by the memory-aware scheduler.
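The two steps named in the abstract (selecting the processes to co-schedule, then allocating them to cores) can be illustrated with a minimal Python sketch. This record does not give the paper's algorithm, so everything here is an assumption made for illustration: the function names `select_processes` and `allocate`, the single per-process bandwidth demand figure, the brute-force selection, the greedy balancing heuristic, and the default topology of two shared caches with two cores each.

```python
from itertools import combinations


def select_processes(demands, slots):
    """Hypothetical selection step: pick `slots` processes whose summed
    bandwidth demand is closest to the average demand per quantum.

    `demands` maps a process name to an estimated bandwidth demand
    (arbitrary units, assumed to be measured elsewhere)."""
    names = sorted(demands)
    if len(names) <= slots:
        return names
    target = sum(demands.values()) * slots / len(names)
    # Brute force over all groups; only reasonable for short run queues.
    best = min(
        combinations(names, slots),
        key=lambda group: abs(sum(demands[p] for p in group) - target),
    )
    return list(best)


def allocate(selected, demands, num_caches=2, cores_per_cache=2):
    """Hypothetical allocation step: place each selected process on a core
    under the shared cache that currently carries the least demand.

    Expects at most num_caches * cores_per_cache processes."""
    groups = [[] for _ in range(num_caches)]
    loads = [0.0] * num_caches
    # Place the most demanding processes first so they end up apart.
    for proc in sorted(selected, key=lambda p: demands[p], reverse=True):
        free = [i for i in range(num_caches) if len(groups[i]) < cores_per_cache]
        cache = min(free, key=lambda i: loads[i])
        groups[cache].append(proc)
        loads[cache] += demands[proc]
    return groups
```

With four processes and four cores, `allocate` pairs the heaviest process with the lightest one under the same shared cache, so that the demand seen by each cache is roughly balanced. A real scheduler would additionally pin each process to its core and refresh the demand estimates every quantum.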
|
Keywords:
|
Memory-aware scheduling, Contention-points, Shared caches, Cache hierarchy
|
Usage rights:
|
All rights reserved
|
ISBN:
|
978-0-7695-4675-9
|
Source:
|
|
DOI:
|
10.1109/IPDPS.2012.54
|
Publisher:
|
IEEE
|
Publisher's version:
|
http://dx.doi.org/10.1109/IPDPS.2012.54
|
Conference title:
|
26th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2012)
|
Conference location:
|
Shanghai, China
|
Conference date:
|
May 21-25, 2012
|
Project code:
|
info:eu-repo/grantAgreement/MICINN//TIN2009-14475-C04-01/ES/Arquitecturas De Servidores, Aplicaciones Y Servicios/
info:eu-repo/grantAgreement/MEC//CSD2006-00046/ES/Arquitecturas fiables y de altas prestaciones para centros de proceso de datos y servidores de Internet/
|
Description:
|
© 2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
|
Acknowledgements:
|
This work was supported by the Spanish MICINN, Consolider Programme and Plan E funds, as well as European Commission FEDER funds, under Grants CSD2006-00046 and TIN2009-14475-C04-01.
|
Type:
|
Conference paper
|