Hernández Luz, Carles; Roca Pérez, Antoni; Flich Cardo, José; Silla Jiménez, Federico; Duato Marín, José Francisco(Institute of Electrical and Electronics Engineers (IEEE), 2011-12)
[EN] Recently, 3D stacking has been proposed to alleviate the memory bandwidth limitation arising in chip multiprocessors (CMPs). As the number of integrated cores in the chip increases, the access to external memory becomes ...
Ubal Tena, Rafael; Sahuquillo Borrás, Julio; Petit Martí, Salvador Vicente; López Rodríguez, Pedro Juan; Duato Marín, José Francisco(Institute of Electrical and Electronics Engineers (IEEE), 2013-05)
Multicore chips are currently dominating the microprocessor market as designs that improve performance and sustain power consumption. However, complex core features must still be considered to provide good performance for ...
Mora Porta, Gaspar(Universitat Politècnica de València, 2009-04-02)
To benefit from reduced latency as well as to lower both power consumption and cost, the optimal number of ports in a switch has been increasing over time. However, architectures ...
Picornell-Sanjuan, Tomás; Flich Cardo, José; Duato Marín, José Francisco; Hernández Luz, Carles(Institute of Electrical and Electronics Engineers, 2020)
[EN] The need for increasing the performance of critical real-time embedded systems pushes the industry to adopt complex multi-core processor designs with embedded networks-on-chip. In this paper we present hp-DCFNoC, a ...
Valero Bresó, Alejandro; Sahuquillo Borrás, Julio; Lorente Garcés, Vicente Jesús; Petit Martí, Salvador Vicente; López Rodríguez, Pedro Juan; Duato Marín, José Francisco(Institute of Electrical and Electronics Engineers (IEEE), 2012-06)
[EN] Cache memories dissipate a significant share of the energy budget in current microprocessors. This is mainly because cache cells are typically implemented with six transistors. To tackle this design concern, recent ...
Reaño González, Carlos; Silla Jiménez, Federico; Castello Gimeno, Adrián; Peña Monferrer, Antonio José; Mayo Gual, Rafael; Quintana Ortí, Enrique Salvador; Duato Marín, José Francisco(Wiley, 2015-09-25)
Graphics processing units (GPUs) are being increasingly embraced by the high-performance computing community as an effective way to reduce execution time by accelerating parts of their applications. Remote CUDA (rCUDA) ...
Cuesta Sáez, Blas Antonio; Ros Bardisa, Alberto; Gómez Requena, María Engracia; Robles Martínez, Antonio; Duato Marín, José Francisco(Institute of Electrical and Electronics Engineers (IEEE), 2013-03)
A key aspect in the design of efficient multiprocessor systems is the cache coherence protocol. Although directory-based protocols constitute the most scalable approach, the limited size of the directory caches together ...
Reaño González, Carlos; Mayo Gual, Rafael; Quintana Ortí, Enrique Salvador; Silla Jiménez, Federico; Duato Marín, José Francisco; Peña Monferrer, Antonio José(IEEE, 2013-09-23)
The use of GPUs to accelerate general-purpose scientific and engineering applications is mainstream today, but their adoption in current high-performance computing clusters is impaired primarily by acquisition costs and ...
Feliu Pérez, Josué; Sahuquillo Borrás, Julio; Petit Martí, Salvador Vicente; Duato Marín, José Francisco(IEEE, 2013)
Improving the utilization of shared resources is a key issue to increase performance in SMT processors. Recent work has focused on resource sharing policies to enhance the processor performance, but their proposals ...
Escudero, Jesús; García, Pedro J.; Quiles, Francisco J.; Flich Cardo, José; Duato Marín, José Francisco(Elsevier, 2011-11)
High-speed interconnection networks are essential elements for different high-performance parallel-computing systems. One of the most common interconnection network topologies is the fat-tree, whose advantages have turned ...
Prades Gasulla, Javier; Silla Jiménez, Federico; Fröning, Holger; Nuessle, Mondrian; Duato Marín, José Francisco(Elsevier, 2015-07)
High Performance Computing usually leverages messaging libraries such as MPI, GASNet, or OpenSHMEM, among others, in order to exchange data among processes in large-scale clusters. Furthermore, these libraries make use ...
Reaño González, Carlos(Universitat Politècnica de València, 2017-09-01)
Graphics Processing Units (GPUs) are being adopted in many computing facilities given their extraordinary computing power, which makes it possible to accelerate many general purpose applications from different domains. ...
Hernández Luz, Carles; Roca Pérez, Antoni; Silla Jiménez, Federico; Flich Cardo, José; Duato Marín, José Francisco(Institute of Electrical and Electronics Engineers (IEEE), 2012-02)
[EN] Current integration scales allow designing chip multiprocessors (CMP), where cores are interconnected by means of a network-on-chip (NoC). Unfortunately, the small feature size of current integration scales causes ...
Feliu-Pérez, Josué; Sahuquillo Borrás, Julio; Petit Martí, Salvador Vicente; Duato Marín, José Francisco(Institute of Electrical and Electronics Engineers, 2017)
[EN] Nowadays, high performance multicore processors implement multithreading capabilities. The processes running concurrently on these processors are continuously competing for the shared resources, not only among cores, ...
March Cabrelles, José Luis; Sahuquillo Borrás, Julio; Petit Martí, Salvador Vicente; Hassan Mohamed, Houcine; Duato Marín, José Francisco(Wiley, 2013-09)
A major design issue in embedded systems is reducing the power consumption because batteries have a limited energy budget. For this purpose, several techniques such as dynamic voltage and frequency scaling (DVFS) or task ...
Ferrer Pérez, Joan Lluís; Baydal Cardona, María Elvira; Robles Martínez, Antonio; López Rodríguez, Pedro Juan; Duato Marín, José Francisco(Institute of Electrical and Electronics Engineers (IEEE), 2012-09)
Congestion management in multistage interconnection networks is a serious problem, which is not solved completely. In order to avoid the degradation of network performance when congestion appears, several congestion ...
Valls, Joan J.; Ros Bardisa, Alberto; Sahuquillo Borrás, Julio; Gómez Requena, María Engracia; Duato Marín, José Francisco(IEEE Computer Society, 2012)
As the number of cores increases in both incoming and future chip multiprocessors, coherence protocols must address novel hardware structures in order to scale in terms of performance, power, and area. It is well ...
Roca Pérez, Antoni; Hernández Luz, Carles; Flich Cardo, José; Silla Jiménez, Federico; Duato Marín, José Francisco(Elsevier, 2013-08)
[EN] It is well known that current Chip MultiProcessor (CMP) and high-end MultiProcessor System-on-Chip (MPSoC) designs are growing in their number of components. Networks-on-Chip (NoC) provide the required connectivity ...
Iserte Agut, Sergio; Castello Gimeno, Adrián; Mayo Gual, Rafael; Quintana Ortí, Enrique Salvador; Silla Jiménez, Federico; Duato Marín, José Francisco; Reaño González, Carlos; Prades Gasulla, Javier(IEEE, 2014-10-22)
SLURM is a resource manager that can be leveraged to share a collection of heterogeneous resources among the jobs in execution in a cluster. However, SLURM is not designed to handle resources such as graphics processing ...
Token Coherence is a cache coherence protocol able to simultaneously capture the best attributes of traditional protocols: low latency and scalability. However, it may lose these desired features when (1) several nodes ...