dc.contributor.author | Montaner Mas, Héctor | es_ES |
dc.contributor.author | Silla Jiménez, Federico | es_ES |
dc.contributor.author | Fröning, Holger | es_ES |
dc.contributor.author | Duato Marín, José Francisco | es_ES |
dc.date.accessioned | 2014-03-03T08:13:19Z | |
dc.date.issued | 2012-06 | |
dc.identifier.issn | 1386-7857 | |
dc.identifier.uri | http://hdl.handle.net/10251/36069 | |
dc.description.abstract | Improvements in parallel computing hardware usually involve increments in the number of available resources for a given application such as the number of computing cores and the amount of memory. In the case of shared-memory computers, the increase in computing resources and available memory is usually constrained by the coherency protocol, whose overhead rises with system size, limiting the scalability of the final system. In this paper we propose an efficient and cost-effective way to increase the memory available for a given application by leveraging free memory in other computers in the cluster. Our proposal is based on the observation that many applications benefit from having more memory resources but do not require more computing cores, thus reducing the requirements for cache coherency and allowing a simpler implementation and better scalability. Simulation results show that, when additional mechanisms intended to hide remote memory latency are used, execution time of applications that use our proposal is similar to the time required to execute them in a computer populated with enough local memory, thus validating the feasibility of our proposal. We are currently building a prototype that implements our ideas. The first results from real executions in this prototype demonstrate not only that our proposal works but also that it can efficiently execute applications that make use of remote memory resources. © 2011 Springer Science+Business Media, LLC. | es_ES |
dc.description.sponsorship | This work has been supported by PROMETEO from Generalitat Valenciana (GVA) under Grant PROMETEO/2008/060. | en_EN |
dc.format.extent | 23 | es_ES |
dc.language | English | es_ES |
dc.publisher | Springer Verlag (Germany) | es_ES |
dc.relation.ispartof | Cluster Computing | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject | Cluster | es_ES |
dc.subject | HyperTransport | es_ES |
dc.subject | Memory aggregation | es_ES |
dc.subject.classification | COMPUTER ARCHITECTURE AND TECHNOLOGY | es_ES |
dc.title | A new degree of freedom for memory allocation in clusters | es_ES |
dc.type | Article | es_ES |
dc.embargo.lift | 10000-01-01 | |
dc.embargo.terms | forever | es_ES |
dc.identifier.doi | 10.1007/s10586-010-0150-7 | |
dc.relation.projectID | info:eu-repo/grantAgreement/Generalitat Valenciana//PROMETEO08%2F2008%2F060/ES/Extensión de la tecnología de red hypertransport para la mejora de la escalabilidad de los servidores de internet/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Informática de Sistemas y Computadores - Departament d'Informàtica de Sistemes i Computadors | es_ES |
dc.description.bibliographicCitation | Montaner Mas, H.; Silla Jiménez, F.; Fröning, H.; Duato Marín, JF. (2012). A new degree of freedom for memory allocation in clusters. Cluster Computing. 15(2):101-123. https://doi.org/10.1007/s10586-010-0150-7 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | http://link.springer.com/article/10.1007%2Fs10586-010-0150-7 | es_ES |
dc.description.upvformatpinicio | 101 | es_ES |
dc.description.upvformatpfin | 123 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 15 | es_ES |
dc.description.issue | 2 | es_ES |
dc.relation.senia | 206486 | |
dc.contributor.funder | Generalitat Valenciana | es_ES |
dc.description.references | 3leaf Systems: http://www.3leafsystems.com | es_ES |
dc.description.references | Acharya, A., Setia, S.: Availability and utility of idle memory in workstation clusters. ACM SIGMETRICS Perform. Eval. Rev. 27(1), 35–46 (1999). doi: 10.1145/301464.301478 | es_ES |
dc.description.references | Anderson, T., Culler, D., Patterson, D.: A case for NOW (Networks of Workstations). IEEE MICRO 15(1), 54–64 (1995). doi: 10.1109/40.342018 | es_ES |
dc.description.references | HyperTransport Technology Consortium. HyperTransport I/O Link Specification Revision 3.10 (2008). Available at http://www.hypertransport.org | es_ES |
dc.description.references | Bienia, C., Kumar, S., et al.: The parsec benchmark suite: Characterization and architectural implications. In: Proceedings of the 17th PACT (2008) | es_ES |
dc.description.references | Chapman, M., Heiser, G.: vNUMA: A virtual shared-memory multiprocessor. In: Proceedings of the 2009 USENIX Annual Technical Conference, San Diego, USA, pp. 349–362 (2009) | es_ES |
dc.description.references | Charles, P., Grothoff, C., Saraswat, V., et al.: X10: an object-oriented approach to non-uniform cluster computing. ACM SIGPLAN Not. 40(10), 519–538 (2005) | es_ES |
dc.description.references | HyperTransport Consortium: HyperTransport High Node Count, Slides. http://www.hypertransport.org/default.cfm?page=HighNodeCountSpecification | es_ES |
dc.description.references | Conway, P., Hughes, B.: The AMD opteron northbridge architecture. IEEE MICRO 27(2), 10–21 (2007). doi: 10.1109/MM.2007.43 | es_ES |
dc.description.references | Conway, P., Kalyanasundharam, N., Donley, G., et al.: Blade computing with the AMD Opteron processor (Magny-Cours). Hot chips 21 (2009) | es_ES |
dc.description.references | Duato, J., Silla, F., Yalamanchili, S., et al.: Extending HyperTransport protocol for improved scalability. First International Workshop on HyperTransport Research and Applications (2009) | es_ES |
dc.description.references | Feeley, M.J., Morgan, W.E., Pighin, E.P., Karlin, A.R., Levy, H.M., Thekkath, C.A.: Implementing global memory management in a workstation cluster. In: SOSP ’95: Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles, pp. 201–212. ACM, New York (1995). doi: 10.1145/224056.224072 | es_ES |
dc.description.references | Fröning, H., Litz, H.: Efficient hardware support for the partitioned global address space. In: 10th Workshop on Communication Architecture for Clusters (2010) | es_ES |
dc.description.references | Fröning, H., Nuessle, M., Slogsnat, D., Litz, H., Brüning, U.: The HTX-board: a rapid prototyping station. In: 3rd annual FPGAworld Conference (2006) | es_ES |
dc.description.references | Garcia-Molina, H., Salem, K.: Main memory database systems: an overview. IEEE Trans. Knowl. Data Eng. 4(6), 509–516 (1992). doi: 10.1109/69.180602 | es_ES |
dc.description.references | Gaussian 03: http://www.gaussian.com | es_ES |
dc.description.references | Gray, J., Liu, D.T., Nieto-Santisteban, M., et al.: Scientific data management in the coming decade. SIGMOD Rec. 34(4), 34–41 (2005). doi: 10.1145/1107499.1107503 | es_ES |
dc.description.references | IBM journal of Research and Development staff: Overview of the IBM Blue Gene/P project. IBM J. Res. Dev. 52(1/2), 199–220 (2008) | es_ES |
dc.description.references | IBM z Series: http://www.ibm.com/systems/z | es_ES |
dc.description.references | In-Memory Database Systems (IMDSs) Beyond the Terabyte Size Boundary: http://www.mcobject.com/130/EmbeddedDatabaseWhitePapers.htm | es_ES |
dc.description.references | Keltcher, C., McGrath, K., Ahmed, A., Conway, P.: The AMD opteron processor for multiprocessor servers. Micro IEEE 23(2), 66–76 (2003). doi: 10.1109/MM.2003.1196116 | es_ES |
dc.description.references | Kottapalli, S., Baxter, J.: Nehalem-EX CPU architecture. Hot chips 21 (2009) | es_ES |
dc.description.references | Liang, S., Noronha, R., Panda, D.: Swapping to remote memory over infiniband: an approach using a high performance network block device. In: Cluster Computing, 2005. IEEE International, pp. 1–10. (2005) doi: 10.1109/CLUSTR.2005.347050 | es_ES |
dc.description.references | Litz, H., Fröning, H., Nuessle, M., Brüning, U.: A hypertransport network interface controller for ultra-low latency message transfers. HyperTransport Consortium White Paper (2007) | es_ES |
dc.description.references | Litz, H., Fröning, H., Nuessle, M., Brüning, U.: VELO: A novel communication engine for ultra-low latency message transfers. In: 37th International Conference on Parallel Processing, 2008. ICPP ’08, pp. 238–245 (2008). doi: 10.1109/ICPP.2008.85 | es_ES |
dc.description.references | Magnusson, P., Christensson, M., Eskilson, J., et al.: Simics: a full system simulation platform. Computer 35(2), 50–58 (2002). doi: 10.1109/2.982916 | es_ES |
dc.description.references | Martin, M., Sorin, D., Beckmann, B., et al.: Multifacet’s general execution-driven multiprocessor simulator (GEMS) toolset. ACM SIGARCH Comput. Archit. News 33(4), 92–99 (2005) doi: 10.1145/1105734.1105747 | es_ES |
dc.description.references | MBA3 NC Series Catalog: http://www.fujitsu.com/global/services/computing/storage/hdd/ehdd/mba3073nc-mba3300nc.html | es_ES |
dc.description.references | McCalpin, J.D.: Memory bandwidth and machine balance in current high performance computers. In: IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, pp. 19–25 (1995) | es_ES |
dc.description.references | NUMAChip: http://www.numachip.com/ | es_ES |
dc.description.references | Oguchi, M., Kitsuregawa, M.: Using available remote memory dynamically for parallel data mining application on ATM-connected PC cluster. In: IPDPS 2000. Proceedings, 14th International, pp. 411–420 (2000). doi: 10.1109/IPDPS.2000.846014 | es_ES |
dc.description.references | Oleszkiewicz, J., Xiao, L., Liu, Y.: Parallel network RAM: effectively utilizing global cluster memory for large data-intensive parallel programs. In: International Conference on Parallel Processing, 2004. ICPP 2004, vol. 1, pp. 353–360 (2004). doi: 10.1109/ICPP.2004.1327942 | es_ES |
dc.description.references | Ronstrom, M., Thalmann, L.: MySQL cluster architecture overview. Technical White Paper. MySQL (2004) | es_ES |
dc.description.references | ScaleMP: http://www.scalemp.com | es_ES |
dc.description.references | SGI: Technical advances in the SGI Altix UV architecture, White Paper. http://www.sgi.com/products/servers/altix/uv/ | es_ES |
dc.description.references | Slogsnat, D., Giese, A., Nüssle, M., Brüning, U.: An open-source HyperTransport core. ACM Trans. Reconfigurable Technol. Syst. 1(3), 1–21 (2008) | es_ES |
dc.description.references | Szalay, A.S., Gray, J., vandenBerg, J.: Petabyte Scale Data Mining: Dream or Reality? CoRR cs.DB/0208013 (2002) | es_ES |
dc.description.references | Tuck, J., Ceze, L., Torrellas, J.: Scalable cache miss handling for high memory-level parallelism. In: Microarchitecture, 2006. MICRO-39. 39th Annual IEEE/ACM International Symposium on (2006) | es_ES |
dc.description.references | Violin Memory: http://violin-memory.com | es_ES |
dc.description.references | Dynamic Logical Partitioning. White Paper: http://www.ibm.com/systems/p/hardware/whitepapers/dlpar.html | es_ES |
dc.description.references | Yelick, K.: Computer architecture: Opportunities and challenges for scalable applications. Sandia CSRI Workshop on Next-generation scalable applications: When MPI-only is not enough (2008) | es_ES |
dc.description.references | Yelick, K.: Programming models: Opportunities and challenges for scalable applications. Sandia CSRI Workshop on Next-generation scalable applications: When MPI-only is not enough (2008) | es_ES |