Feliu-Pérez, J.; Naithani, A.; Sahuquillo Borrás, J.; Petit Martí, SV.; Qureshi, M.; Eeckhout, L. (2022). VMT: Virtualized Multi-Threading for Accelerating Graph Workloads on Commodity Processors. IEEE Transactions on Computers. 71(6):1386-1398. https://doi.org/10.1109/TC.2021.3086069
Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/194387
Title:
|
VMT: Virtualized Multi-Threading for Accelerating Graph Workloads on Commodity Processors
|
Authors:
|
Feliu-Pérez, Josué
Naithani, Ajeya
Sahuquillo Borrás, Julio
Petit Martí, Salvador Vicente
Qureshi, Moinuddin
Eeckhout, Lieven
|
UPV entity:
|
Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors
Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica
|
Release date:
|
|
Abstract:
|
[EN] Modern-day graph workloads operate on huge graphs through pointer chasing which leads to high last-level cache (LLC) miss rates and limited memory-level parallelism (MLP). Simultaneous Multi-Threading (SMT) effectively hides the memory access latencies for multi-threaded graph workloads provided that sufficient threads are supported in hardware. Unfortunately, providing a sufficiently large number of physical threads incurs an unjustifiably high hardware cost for commodity SMT processors which typically implement only two physical hardware threads. Ideally, we would like to achieve aggressive-SMT performance when running graph workloads on modest commodity processors. In this paper, we propose Virtualized Multi-Threading (VMT), a low-overhead multi-threading paradigm for accelerating graph workloads on commodity processors. Unlike prior multi-threading paradigms, VMT virtualizes both the physical hardware threads and the architecture state: VMT maps a large number of logical software threads to a small number of physical hardware threads, while maintaining the architecture state of the logical threads in the processor's cache hierarchy. Implemented on top of a quad-core 2-way SMT processor, VMT achieves an average speedup of 1.74x for a set of representative graph workloads, while incurring minimal hardware cost (195 bytes per core to support up to 32 logical threads). VMT's low hardware cost paves the way for implementation in commodity processors.
|
Keywords:
|
Instruction sets, Computer architecture, Hardware, Registers, Software, Message systems, Switches, Architecture, Multi-threading, Virtualization, Graph workloads
|
Usage rights:
|
All rights reserved
|
Source:
|
IEEE Transactions on Computers (ISSN: 0018-9340)
|
DOI:
|
10.1109/TC.2021.3086069
|
Publisher:
|
Institute of Electrical and Electronics Engineers
|
Publisher's version:
|
https://doi.org/10.1109/TC.2021.3086069
|
Project code:
|
info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-098156-B-C51/ES/TECNOLOGIAS INNOVADORAS DE PROCESADORES, ACELERADORES Y REDES, PARA CENTROS DE DATOS Y COMPUTACION DE ALTAS PRESTACIONES/
info:eu-repo/grantAgreement/ERC//741097//Load Slice Core: A Power and Cost-Efficient Microarchitecture for the Future/
info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-098156-B-C53/ES/TECNICAS INNOVADORAS EN COMPUTACION ESPECIALIZADA Y DE ALTAS PRESTACIONES/
info:eu-repo/grantAgreement/FWO//G.0144.17N/
info:eu-repo/grantAgreement/EC/H2020/741097/EU
info:eu-repo/grantAgreement/MCIU//FJC2018-036021-I//Ayudas Juan de la Cierva - Formación/
info:eu-repo/grantAgreement/AEI//RTI2018-098156-B-C53//TECNICAS INNOVADORAS EN COMPUTACION ESPECIALIZADA Y DE ALTAS PRESTACIONES/
info:eu-repo/grantAgreement/AEI//RTI2018-098156-B-C51//TECNOLOGIAS INNOVADORAS DE PROCESADORES, ACELERADORES Y REDES, PARA CENTROS DE DATOS Y COMPUTACION DE ALTAS PRESTACIONES/
|
Acknowledgements:
|
This work was supported in part by the Spanish MCIU and AEI, Spain, as well as European Commission FEDER funds, under grants RTI2018-098156-B-C53 and RTI2018-098156-B-C51. The work of Josué Feliu was supported by a Juan de la Cierva Formación Contract under Grant FJC2018-036021-I. The work of Lieven Eeckhout was supported in part by the European Research Council Advanced Grant agreement no. 741097 and in part by the Flanders Research Council under Grant G.0144.17N.
|
Type:
|
Article
|