Cross-language source code re-use detection using latent semantic analysis

Flores Sáez, Enrique; Barrón-Cedeño, Luis Alberto; Moreno Boronat, Lidia Ana; Rosso, Paolo

doi:10.3217/jucs-021-13-1708

Files in this item

Name: Flores-et-al-JUCS ...

Size: 247.7Kb

Format: PDF

Description: Publisher's version ...

dc.contributor.author	Flores Sáez, Enrique	es_ES
dc.contributor.author	Barrón-Cedeño, Luis Alberto	es_ES
dc.contributor.author	Moreno Boronat, Lidia Ana	es_ES
dc.contributor.author	Rosso, Paolo	es_ES
dc.date.accessioned	2016-05-04T17:47:18Z
dc.date.available	2016-05-04T17:47:18Z
dc.date.issued	2015
dc.identifier.issn	0948-695X
dc.identifier.uri	http://hdl.handle.net/10251/63642
dc.description.abstract	[EN] Nowadays, Internet is the main source to get information from blogs, encyclopedias, discussion forums, source code repositories, and more resources which are available just one click away. The temptation to re-use these materials is very high. Even source codes are easily available through a simple search on the Web. There is a need of detecting potential instances of source code re-use. Source code re-use detection has usually been approached comparing source codes in their compiled version. When dealing with cross-language source code re-use, traditional approaches can deal only with the programming languages supported by the compiler. We assume that a source code is a piece of text, with its syntax and structure, so we aim at applying models for free text re-use detection to source code. In this paper we compare a Latent Semantic Analysis (LSA) approach with previously used text re-use detection models for measuring cross-language similarity in source code. The LSA-based approach shows slightly better results than the other models, being able to distinguish between re-used and related source codes with a high performance.	es_ES
dc.description.sponsorship	This work was partially supported by Universitat Politècnica de València, WIQ-EI (IRSES grant n. 269180), and DIANA-APPLICATIONS (TIN2012-38603-C02-01) project. The work of the fourth author is also supported by VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems.
dc.language	English	es_ES
dc.publisher	Graz University of Technology, Institut für Informationssysteme und Computer Medien (IICM)	es_ES
dc.relation.ispartof	Journal of Universal Computer Science	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	Cross-language re-use detection	es_ES
dc.subject	Source code	es_ES
dc.subject	Plagiarism	es_ES
dc.subject	Latent semantic analysis	es_ES
dc.subject.classification	COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE	es_ES
dc.subject.classification	COMPUTER LANGUAGES AND SYSTEMS	es_ES
dc.title	Cross-language source code re-use detection using latent semantic analysis	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.3217/jucs-021-13-1708
dc.relation.projectID	info:eu-repo/grantAgreement/MICINN//TIN2012-38603-C02-01/ES/DIANA-APPLICATIONS: FINDING HIDDEN KNOWLEDGE IN TEXTS: APPLICATIONS/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/EC/FP7/269180/EU/Web Information Quality Evaluation Initiative/
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.description.bibliographicCitation	Flores Sáez, E.; Barrón-Cedeño, LA.; Moreno Boronat, LA.; Rosso, P. (2015). Cross-language source code re-use detection using latent semantic analysis. Journal of Universal Computer Science. 21(13):1708-1725. https://doi.org/10.3217/jucs-021-13-1708	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	http://dx.doi.org/10.3217/jucs-021-13-1708	es_ES
dc.description.upvformatpinicio	1708	es_ES
dc.description.upvformatpfin	1725	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	21	es_ES
dc.description.issue	13	es_ES
dc.relation.senia	303947	es_ES
dc.contributor.funder	European Commission
dc.contributor.funder	Ministerio de Ciencia e Innovación

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia