High performance lattice reduction on heterogeneous computing platform

Jozsa, Csaba M; Domene Oltra, Fernando; Vidal Maciá, Antonio Manuel; Piñero Sipán, María Gemma; González Salvador, Alberto

doi:10.1007/s11227-014-1201-2


Files in this item

Name: 10.1007_s11227-01 ...

Size: 433.9Kb

Format: PDF

Description: Author's version.

Name: Csaba M. Józsa;Do ...

Size: 501.5Kb

Format: PDF

Description: Publisher's version

dc.contributor.author	Jozsa, Csaba M	es_ES
dc.contributor.author	Domene Oltra, Fernando	es_ES
dc.contributor.author	Vidal Maciá, Antonio Manuel	es_ES
dc.contributor.author	Piñero Sipán, María Gemma	es_ES
dc.contributor.author	González Salvador, Alberto	es_ES
dc.date.accessioned	2015-04-28T07:56:54Z
dc.date.available	2015-04-28T07:56:54Z
dc.date.issued	2014-11
dc.identifier.issn	0920-8542
dc.identifier.uri	http://hdl.handle.net/10251/49342
dc.description	The final publication is available at Springer via http://dx.doi.org/10.1007/s11227-014-1201-2	es_ES
dc.description.abstract	The lattice reduction (LR) technique has become very important in many engineering fields. However, its high complexity makes it difficult to use in real-time applications, especially those that deal with large matrices. As a solution, the modified block LLL (MB-LLL) algorithm was introduced, where several levels of parallelism were exploited: (a) fine-grained parallelism was achieved through the cost-reduced all-swap LLL (CR-AS-LLL) algorithm introduced together with the MB-LLL by Józsa et al. (Proceedings of the tenth international symposium on wireless communication systems, 2013) and (b) coarse-grained parallelism was achieved by applying the block-reduction concept presented by Wetzel (Algorithmic number theory. Springer, New York, pp 323-337, 1998). In this paper, we present the cost-reduced MB-LLL (CR-MB-LLL) algorithm, which significantly reduces the computational complexity of the MB-LLL in two ways: by relaxing the first LLL condition while executing the LR of submatrices, which delays the update of the Gram-Schmidt coefficients, and by using less costly procedures during the boundary checks. The effects of complexity reduction and implementation details are analyzed and discussed for several architectures. A mapping of the CR-MB-LLL on a heterogeneous platform is proposed and compared with implementations running on a dynamic-parallelism-enabled GPU and a multi-core CPU. The mapping on the proposed architecture allows dynamic scheduling of kernels, where the overhead introduced is hidden by the use of several CUDA streams. Results show that the CR-MB-LLL algorithm on the heterogeneous platform outperforms the multi-core CPU implementation in execution time and is more efficient than the CR-AS-LLL algorithm for large matrices.	es_ES
dc.description.sponsorship	Financial support for this study was provided by grants TAMOP-4.2.1./B-11/2/KMR-2011-0002 and TAMOP-4.2.2/B-10/1-2010-0014 from the Pázmány Péter Catholic University, by the European Union ERDF, by the Spanish Government through project TEC2012-38142-C04-01, and by the Generalitat Valenciana through project PROMETEO/2009/013.	en_EN
dc.language	English	es_ES
dc.publisher	Springer Verlag (Germany)	es_ES
dc.relation.ispartof	Journal of Supercomputing	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	Lattice reduction	es_ES
dc.subject	LLL	es_ES
dc.subject	GPU	es_ES
dc.subject	CUDA	es_ES
dc.subject	OpenMP	es_ES
dc.subject.classification	COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE	es_ES
dc.subject.classification	SIGNAL THEORY AND COMMUNICATIONS	es_ES
dc.title	High performance lattice reduction on heterogeneous computing platform	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.1007/s11227-014-1201-2
dc.relation.projectID	info:eu-repo/grantAgreement/PPCU//TAMOP-4.2.1./B-11/2/KMR-2011-0002/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/PPCU//TAMOP-4.2.2/B-10/1-2010-0014/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI//TEC2012-38142-C04-01/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/Generalitat Valenciana//PROMETEO09%2F2009%2F013/ES/Computacion de altas prestaciones sobre arquitecturas actuales en porblemas de procesado múltiple de señal/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Instituto Universitario de Telecomunicación y Aplicaciones Multimedia - Institut Universitari de Telecomunicacions i Aplicacions Multimèdia	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Comunicaciones - Departament de Comunicacions	es_ES
dc.description.bibliographicCitation	Jozsa, CM.; Domene Oltra, F.; Vidal Maciá, AM.; Piñero Sipán, MG.; González Salvador, A. (2014). High performance lattice reduction on heterogeneous computing platform. Journal of Supercomputing. 70(2):772-785. https://doi.org/10.1007/s11227-014-1201-2	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	http://dx.doi.org/10.1007/s11227-014-1201-2	es_ES
dc.description.upvformatpinicio	772	es_ES
dc.description.upvformatpfin	785	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	70	es_ES
dc.description.issue	2	es_ES
dc.relation.senia	279028
dc.contributor.funder	Generalitat Valenciana	es_ES
dc.contributor.funder	Agencia Estatal de Investigación
dc.contributor.funder	Pázmány Péter Catholic University
dc.description.references	Józsa CM, Domene F, Piñero G, González A, Vidal AM (2013) Efficient GPU implementation of lattice-reduction-aided multiuser precoding. In: Proceedings of the tenth international symposium on wireless communication systems (ISWCS 2013)	es_ES
dc.description.references	Wetzel S (1998) An efficient parallel block-reduction algorithm. In: Buhler JP (ed) Algorithmic number theory. Lecture notes in computer science, vol 1423. Springer, Berlin, Heidelberg, pp 323–337	es_ES
dc.description.references	Wubben D, Seethaler D, Jaldén J, Matz G (2011) Lattice reduction. Signal Process Mag IEEE 28(3):70–91	es_ES
dc.description.references	Lenstra AK, Lenstra HW, Lovász L (1982) Factoring polynomials with rational coefficients. Math Ann 261(4):515–534	es_ES
dc.description.references	Bremner MR (2012) Lattice basis reduction: an introduction to the LLL algorithm and its applications. CRC Press, USA	es_ES
dc.description.references	Wu D, Eilert J, Liu D (2008) A programmable lattice-reduction aided detector for MIMO-OFDMA. In: 4th IEEE international conference on circuits and systems for communications (ICCSC 2008), pp 293–297	es_ES
dc.description.references	Barbero LG, Milliner DL, Ratnarajah T, Barry JR, Cowan C (2009) Rapid prototyping of Clarkson’s lattice reduction for MIMO detection. In: IEEE international conference on communications (ICC’09), pp 1–5	es_ES
dc.description.references	Gestner B, Zhang W, Ma X, Anderson D (2011) Lattice reduction for MIMO detection: from theoretical analysis to hardware realization. IEEE Trans Circ Syst I Regul Pap 58(4):813–826	es_ES
dc.description.references	Shabany M, Youssef A, Gulak G (2013) High-throughput 0.13-μm CMOS lattice reduction core supporting 880 Mb/s detection. IEEE Trans Very Large Scale Integr (VLSI) Syst 21(5):848–861	es_ES
dc.description.references	Luo Y, Qiao S (2011) A parallel LLL algorithm. In: Proceedings of the fourth international C* conference on computer science and software engineering, pp 93–101	es_ES
dc.description.references	Backes W, Wetzel S (2011) Parallel lattice basis reduction—the road to many-core. In: IEEE 13th international conference on high performance computing and communications (HPCC)	es_ES
dc.description.references	Ahmad U, Amin A, Li M, Pollin S, Van der Perre L, Catthoor F (2011) Scalable block-based parallel lattice reduction algorithm for an SDR baseband processor. In: 2011 IEEE international conference on communications (ICC)	es_ES
dc.description.references	Villard G (1992) Parallel lattice basis reduction. In: Papers from the international symposium on symbolic and algebraic computation (ISSAC’92). ACM, New York	es_ES
dc.description.references	Domene F, Józsa CM, Vidal AM, Piñero G, Gonzalez A (2013) Performance analysis of a parallel lattice reduction algorithm on many-core architectures. In: Proceedings of the 13th international conference on computational and mathematical methods in science and engineering	es_ES
dc.description.references	Gestner B, Zhang W, Ma X, Anderson DV (2008) VLSI implementation of a lattice reduction algorithm for low-complexity equalization. In: 4th IEEE international conference on circuits and systems for communications (ICCSC 2008), pp 643–647	es_ES
dc.description.references	Burg A, Seethaler D, Matz G (2007) VLSI implementation of a lattice-reduction algorithm for multi-antenna broadcast precoding. In: IEEE international symposium on circuits and systems (ISCAS 2007), pp 673–676	es_ES
dc.description.references	Bruderer L, Studer C, Wenk M, Seethaler D, Burg A (2010) VLSI implementation of a low-complexity LLL lattice reduction algorithm for MIMO detection. In: Proceedings of 2010 IEEE international symposium on circuits and systems (ISCAS)	es_ES
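
For readers unfamiliar with the two LLL conditions the abstract refers to, the following is a minimal sketch of the textbook sequential LLL algorithm (Lenstra, Lenstra and Lovász, 1982). It is not the MB-LLL, CR-AS-LLL, or CR-MB-LLL algorithm of the paper, and it makes no attempt at parallelism. The function names, the row-vector convention, the use of exact rational arithmetic, and the parameter value delta = 3/4 are all assumptions chosen for illustration; the input is assumed to be a linearly independent integer basis. The first condition is size reduction (every Gram-Schmidt coefficient has absolute value at most 1/2) and the second is the Lovász condition.

```python
from fractions import Fraction


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Return (orthogonalized vectors, mu coefficients) for a row-vector basis."""
    n = len(basis)
    ortho = []
    mu = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in basis[i]]
        for j in range(i):
            mu[i][j] = dot(basis[i], ortho[j]) / dot(ortho[j], ortho[j])
            v = [a - mu[i][j] * b for a, b in zip(v, ortho[j])]
        ortho.append(v)
    return ortho, mu


def is_lll_reduced(basis, delta=Fraction(3, 4)):
    """Check the size-reduction and Lovász conditions for a basis."""
    ortho, mu = gram_schmidt(basis)
    n = len(basis)
    size_ok = all(abs(mu[i][j]) <= Fraction(1, 2)
                  for i in range(n) for j in range(i))
    lovasz_ok = all(
        dot(ortho[k], ortho[k])
        >= (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        for k in range(1, n)
    )
    return size_ok and lovasz_ok


def lll_reduce(basis, delta=Fraction(3, 4)):
    """Textbook LLL on an integer basis given as a list of row vectors.

    Gram-Schmidt data is recomputed from scratch at every step for clarity,
    which is far more expensive than the incremental updates used in practice.
    """
    b = [[Fraction(x) for x in row] for row in basis]
    n = len(b)
    k = 1
    while k < n:
        # Size reduction: make |mu[k][j]| <= 1/2 for all j < k.
        for j in range(k - 1, -1, -1):
            _, mu = gram_schmidt(b)
            q = round(mu[k][j])
            if q:
                b[k] = [x - q * y for x, y in zip(b[k], b[j])]
        # Lovász condition: advance if it holds, otherwise swap and step back.
        ortho, mu = gram_schmidt(b)
        lhs = dot(ortho[k], ortho[k])
        rhs = (delta - mu[k][k - 1] ** 2) * dot(ortho[k - 1], ortho[k - 1])
        if lhs >= rhs:
            k += 1
        else:
            b[k], b[k - 1] = b[k - 1], b[k]
            k = max(k - 1, 1)
    return [[int(x) for x in row] for row in b]
```

Calling `lll_reduce([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])` returns a basis of the same lattice for which `is_lll_reduced` holds. Relaxing the size-reduction step, as the abstract describes for the reduction of submatrices, would correspond to deferring part of the inner loop over `j`; that variant is not shown here.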

This item appears in the following collection(s)

Articles, conferences, monographs [48344]


RiuNet: Institutional Repository of the Universidad Politécnica de Valencia
