dc.contributor.author | Puche, José | es_ES |
dc.contributor.author | Petit Martí, Salvador Vicente | es_ES |
dc.contributor.author | Gómez Requena, María Engracia | es_ES |
dc.contributor.author | Sahuquillo Borrás, Julio | es_ES |
dc.date.accessioned | 2021-06-30T03:31:04Z | |
dc.date.available | 2021-06-30T03:31:04Z | |
dc.date.issued | 2020-09 | es_ES |
dc.identifier.issn | 0167-739X | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/168546 | |
dc.description.abstract | [EN] The cache hierarchy of current multicores typically consists of three levels, ranging from the faster and smaller L1 level to the slower and larger L3 level. This approach has been demonstrated to be effective in high performance processors, since it reduces the average memory access time. However, when implemented in devices where energy efficiency becomes critical, like low power or embedded processors, conventional cache hierarchies may present some concerns. These concerns, which waste area and energy, are multiple cache lookups, block replication, block migration and private cache space overprovisioning. To deal with these issues, in this work we propose FOS-Mt, a new cache organization aimed at saving energy in current multicores running multithreaded applications. FOS-Mt's cache hierarchy consists of only two levels: the L1 cache level located in the core pipeline, and a single, flattened second level that forms an aggregated cache space accessible by all the execution cores. This level is sliced into multiple small buffers, which are dynamically assigned to any of the running threads when they are expected to improve the system performance. Those buffers that are not allocated to any core are powered off to save energy. Experimental results show that FOS-Mt significantly reduces both static and dynamic energy consumption over other conventional cache organizations like NUCA or shared caches with the same storage capacity. Compared to the widely known cache decay approach, FOS-Mt improves the energy-delay product by 19.3% on average. Moreover, despite the fact that FOS-Mt is an energy-aware architecture, performance is scarcely affected, since it is kept similar to that achieved by conventional and cache decay approaches. | es_ES |
dc.description.sponsorship | This work has been supported by the Spanish Ministerio de Economia y Competitividad under grant RTI2018-098156-B-C51, and by the Generalitat Valenciana, Spain under grant AICO/2019/317. | es_ES |
dc.language | English | es_ES |
dc.publisher | Elsevier | es_ES |
dc.relation.ispartof | Future Generation Computer Systems | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject | Cache hierarchy | es_ES |
dc.subject | Multicores | es_ES |
dc.subject | Energy efficiency | es_ES |
dc.subject | On-chip photonic network | es_ES |
dc.subject.classification | COMPUTER ARCHITECTURE AND TECHNOLOGY | es_ES |
dc.title | An efficient cache flat storage organization for multithreaded workloads for low power processors | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1016/j.future.2019.11.024 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-098156-B-C51/ES/TECNOLOGIAS INNOVADORAS DE PROCESADORES, ACELERADORES Y REDES, PARA CENTROS DE DATOS Y COMPUTACION DE ALTAS PRESTACIONES/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GVA//AICO%2F2019%2F317/ | es_ES |
dc.rights.accessRights | Closed | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors | es_ES |
dc.description.bibliographicCitation | Puche, J.; Petit Martí, SV.; Gómez Requena, ME.; Sahuquillo Borrás, J. (2020). An efficient cache flat storage organization for multithreaded workloads for low power processors. Future Generation Computer Systems. 110:1037-1054. https://doi.org/10.1016/j.future.2019.11.024 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1016/j.future.2019.11.024 | es_ES |
dc.description.upvformatpinicio | 1037 | es_ES |
dc.description.upvformatpfin | 1054 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 110 | es_ES |
dc.relation.pasarela | S\398066 | es_ES |
dc.contributor.funder | Generalitat Valenciana | es_ES |
dc.contributor.funder | Agencia Estatal de Investigación | es_ES |
dc.description.references | S. Kaxiras, Z. Hu, M. Martonosi, Cache decay: exploiting generational behavior to reduce cache leakage power, in: Procs. of the 28th Annual International Symposium on Computer Architecture, ISCA’01, 2001, pp. 240–251. | es_ES |
dc.description.references | Sinharoy, B., Kalla, R. N., Tendler, J. M., Eickemeyer, R. J., & Joyner, J. B. (2005). POWER5 system microarchitecture. IBM Journal of Research and Development, 49(4.5), 505-521. doi:10.1147/rd.494.0505 | es_ES |
dc.description.references | M. Qureshi, Y. Patt, Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches, in: MICRO, 2006, pp. 423–432. | es_ES |
dc.description.references | Selfa, V., Sahuquillo, J., Gómez, M. E., & Gómez, C. (2018). Efficient selective multicore prefetching under limited memory bandwidth. Journal of Parallel and Distributed Computing, 120, 32-43. doi:10.1016/j.jpdc.2018.05.002 | es_ES |
dc.description.references | A. Shacham, K. Bergman, L. Carloni, On the design of a photonic network-on-chip, in: Networks-on-Chip, NOCS 2007, pp. 53–64. | es_ES |
dc.description.references | G. Chen, H. Chen, M. Haurylau, N. Nelson, P.M. Fauchet, E.G. Friedman, D. Albonesi, Predictions of CMOS compatible on-chip optical interconnect, in: Procs. of Int. Workshop on System Level Interconnect Prediction, SLIP ’05, 2005, pp. 13–20. | es_ES |
dc.description.references | J. Pang, C. Dwyer, A.R. Lebeck, Exploiting emerging technologies for nanoscale photonic networks-on-chip, in: Procs. of 6th Int. Workshop on NoC Architectures, NoCArc ’13, pp. 53–58. | es_ES |
dc.description.references | Soref, R., & Bennett, B. (1987). Electrooptical effects in silicon. IEEE Journal of Quantum Electronics, 23(1), 123-129. doi:10.1109/jqe.1987.1073206 | es_ES |
dc.description.references | García-Guirado, A., Fernández-Pascual, R., García, J. M., & Bartolini, S. (2014). Managing resources dynamically in hybrid photonic-electronic networks-on-chip. Concurrency and Computation: Practice and Experience, 26(15), 2530-2550. doi:10.1002/cpe.3332 | es_ES |
dc.description.references | D. Vantrease, N. Binkert, R. Schreiber, M. Lipasti, Light speed arbitration and flow control for nanophotonic interconnects, in: Microarchitecture, 2009. MICRO-42. 42nd Annual IEEE/ACM International Symposium, pp. 304–315. | es_ES |
dc.description.references | S. Werner, J. Navaridas, M. Lujan, Designing low-power, low-latency networks-on-chip by optimally combining electrical and optical Links, in: 2017 IEEE Int. Symp. of High Performance Computer Architecture, IEEE, Manchester, UK. | es_ES |
dc.description.references | Bahirat, S., & Pasricha, S. (2014). METEOR. ACM Transactions on Embedded Computing Systems, 13(3s), 1-33. doi:10.1145/2567940 | es_ES |
dc.description.references | R. Morris, A.K. Kodi, A. Louri, Dynamic reconfiguration of 3D photonic networks-on-chip for maximizing performance and improving fault tolerance, in: 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 282–293. http://dx.doi.org/10.1109/MICRO.2012.34. | es_ES |
dc.description.references | R. Ubal, J. Sahuquillo, S. Petit, P. Lopez, Multi2Sim: A simulation framework to evaluate multicore-multithreaded processors, in: Int. Symp. on Computer Architecture and High Performance Computing, pp. 62–68. http://dx.doi.org/10.1109/SBAC-PAD.2007.17. | es_ES |
dc.description.references | Rosenfeld, P., Cooper-Balis, E., & Jacob, B. (2011). DRAMSim2: A Cycle Accurate Memory System Simulator. IEEE Computer Architecture Letters, 10(1), 16-19. doi:10.1109/l-ca.2011.4 | es_ES |
dc.description.references | N. Muralimanohar, R. Balasubramonian, N.P. Jouppi, CACTI 6.0: A tool to model large caches, in: HP Laboratories, 2009. | es_ES |
dc.description.references | Man-Lap Li, R. Sasanka, S.V. Adve, Yen-Kuang Chen, E. Debes, The ALPBench benchmark suite for complex multimedia applications, in: Proceedings of the IEEE International Workload Characterization Symposium, 2005, IIWC’05, 2015. | es_ES |
dc.description.references | Valero, A., Petit, S., Sahuquillo, J., Kaeli, D. R., & Duato, J. (2015). A reuse-based refresh policy for energy-aware eDRAM caches. Microprocessors and Microsystems, 39(1), 37-48. doi:10.1016/j.micpro.2014.12.001 | es_ES |
dc.description.references | Valero, A., Sahuquillo, J., Petit, S., López, P., & Duato, J. (2012). Combining recency of information with selective random and a victim cache in last-level caches. ACM Transactions on Architecture and Code Optimization, 9(3), 1-20. doi:10.1145/2355585.2355589 | es_ES |
dc.description.references | S. Kim, D. Chandra, D. Solihin, Fair cache sharing and partitioning in a chip multiprocessor architecture, in: PACT, 2004, pp. 111–122. | es_ES |
dc.description.references | Sahuquillo, J., & Pont, A. (2000). Splitting the data cache: a survey. IEEE Concurrency, 8(3), 30-35. doi:10.1109/4434.865890 | es_ES |
dc.description.references | J.A. Rivers, E.S. Tam, G.S. Tyson, E.S. Davidson, M.K. Farrens, Utilizing reuse information in data cache management, in: Proceedings of the 12th International Conference on Supercomputing, ICS 1998, Melbourne, Australia, July 13–17, 1998, pp. 449–456. http://dx.doi.org/10.1145/277830.277941. | es_ES |
dc.description.references | J. Sahuquillo, A. Pont, The filter cache: A run-time cache management approach, in: 25th EUROMICRO ’99 Conference, Informatics: Theory and Practice for the New Millenium, 8–10 September 1999, Milan, Italy, 1999, pp. 1424–1431. http://dx.doi.org/10.1109/EURMIC.1999.794504. | es_ES |
dc.description.references | Chishti, Z., Powell, M. D., & Vijaykumar, T. N. (2005). Optimizing Replication, Communication, and Capacity Allocation in CMPs. ACM SIGARCH Computer Architecture News, 33(2), 357-368. doi:10.1145/1080695.1070001 | es_ES |
dc.description.references | Hardavellas, N., Ferdman, M., Falsafi, B., & Ailamaki, A. (2009). Reactive NUCA. ACM SIGARCH Computer Architecture News, 37(3), 184-195. doi:10.1145/1555815.1555779 | es_ES |
dc.description.references | Tsai, P.-A., Beckmann, N., & Sanchez, D. (2017). Jenga. ACM SIGARCH Computer Architecture News, 45(2), 652-665. doi:10.1145/3140659.3080214 | es_ES |
dc.description.references | D. Vantrease, R. Schreiber, M. Monchiero, M. McLaren, N. Jouppi, M. Fiorentino, A. Davis, N. Binkert, R. Beausoleil, J. Ahn, Corona: System implications of emerging nanophotonic technology, in: Computer Architecture, 2008. ISCA ’08. 35th International Symposium on, pp. 153–164. http://dx.doi.org/10.1109/ISCA.2008.35. | es_ES |
dc.description.references | Y. Pan, J. Kim, G. Memik, FlexiShare: Channel sharing for an energy-efficient nanophotonic crossbar, in: High Performance Computer Architecture, 2010 IEEE 16th International Symposium, pp. 1–12. http://dx.doi.org/10.1109/HPCA.2010.5416626. | es_ES |
dc.description.references | Pan, Y., Kumar, P., Kim, J., Memik, G., Zhang, Y., & Choudhary, A. (2009). Firefly. ACM SIGARCH Computer Architecture News, 37(3), 429-440. doi:10.1145/1555815.1555808 | es_ES |
dc.description.references | Li, C., Browning, M., Gratz, P. V., & Palermo, S. (2014). LumiNOC: A Power-Efficient, High-Performance, Photonic Network-on-Chip. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 33(6), 826-838. doi:10.1109/tcad.2014.2320510 | es_ES |