dc.contributor.author | Flores Sáez, Enrique | es_ES |
dc.contributor.author | Barrón-Cedeño, Luis Alberto | es_ES |
dc.contributor.author | Moreno Boronat, Lidia Ana | es_ES |
dc.contributor.author | Rosso, Paolo | es_ES |
dc.date.accessioned | 2016-05-04T17:47:18Z | |
dc.date.available | 2016-05-04T17:47:18Z | |
dc.date.issued | 2015 | |
dc.identifier.issn | 0948-695X | |
dc.identifier.uri | http://hdl.handle.net/10251/63642 | |
dc.description.abstract | [EN] Nowadays, the Internet is the main source of information: blogs, encyclopedias, discussion forums, source code repositories, and many more resources are available just one click away. The temptation to re-use these materials is very high. Even source codes are easily available through a simple search on the Web. There is a need to detect potential instances of source code re-use. Source code re-use detection has usually been approached by comparing source codes in their compiled version. When dealing with cross-language source code re-use, traditional approaches can deal only with the programming languages supported by the compiler. We assume that a source code is a piece of text, with its syntax and structure, so we aim at applying models for free text re-use detection to source code. In this paper we compare a Latent Semantic Analysis (LSA) approach with previously used text re-use detection models for measuring cross-language similarity in source code. The LSA-based approach shows slightly better results than the other models, being able to distinguish between re-used and related source codes with a high performance. | es_ES |
dc.description.sponsorship | This work was partially supported by Universitat Politècnica de València, WIQ-EI (IRSES grant n. 269180), and DIANA-APPLICATIONS (TIN2012-38603-C02-01) project. The work of the fourth author is also supported by VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems. | |
dc.language | English | es_ES |
dc.publisher | Graz University of Technology, Institut für Informationssysteme und Computer Medien (IICM) | es_ES |
dc.relation.ispartof | Journal of Universal Computer Science | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject | Cross-language re-use detection | es_ES |
dc.subject | Source code | es_ES |
dc.subject | Plagiarism | es_ES |
dc.subject | Latent semantic analysis | es_ES |
dc.subject.classification | COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE | es_ES |
dc.subject.classification | COMPUTER LANGUAGES AND SYSTEMS | es_ES |
dc.title | Cross-language source code re-use detection using latent semantic analysis | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.3217/jucs-021-13-1708 | |
dc.relation.projectID | info:eu-repo/grantAgreement/MICINN//TIN2012-38603-C02-01/ES/DIANA-APPLICATIONS: FINDING HIDDEN KNOWLEDGE IN TEXTS: APPLICATIONS/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/FP7/269180/EU/Web Information Quality Evaluation Initiative/ | |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació | es_ES |
dc.description.bibliographicCitation | Flores Sáez, E.; Barrón-Cedeño, LA.; Moreno Boronat, LA.; Rosso, P. (2015). Cross-language source code re-use detection using latent semantic analysis. Journal of Universal Computer Science. 21(13):1708-1725. https://doi.org/10.3217/jucs-021-13-1708 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | http://dx.doi.org/10.3217/jucs-021-13-1708 | es_ES |
dc.description.upvformatpinicio | 1708 | es_ES |
dc.description.upvformatpfin | 1725 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 21 | es_ES |
dc.description.issue | 13 | es_ES |
dc.relation.senia | 303947 | es_ES |
dc.contributor.funder | European Commission | |
dc.contributor.funder | Ministerio de Ciencia e Innovación | |