dc.contributor.author | Franco-Salvador, Marc | es_ES |
dc.contributor.author | Gupta, Parth Alokkumar | es_ES |
dc.contributor.author | Rosso, Paolo | es_ES |
dc.contributor.author | Banchs, Rafael | es_ES |
dc.date.accessioned | 2017-06-07T08:27:54Z | |
dc.date.available | 2017-06-07T08:27:54Z | |
dc.date.issued | 2016-11-01 | |
dc.identifier.issn | 0950-7051 | |
dc.identifier.uri | http://hdl.handle.net/10251/82493 | |
dc.description | This is the author’s version of a work that was accepted for publication in Knowledge-Based Systems. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Knowledge-Based Systems 111 (2016) 87–99. DOI 10.1016/j.knosys.2016.08.004. | es_ES |
dc.description.abstract | Cross-language (CL) plagiarism detection aims at detecting plagiarised fragments of text among documents in different languages. The main research question of this work is whether knowledge graph representations and continuous space representations can complement each other and improve the state-of-the-art performance in CL plagiarism detection methods. In this sense, we propose and evaluate hybrid models to assess the semantic similarity of two segments of text in different languages. The proposed hybrid models combine knowledge graph representations with continuous space representations aiming at exploiting their complementarity in capturing different aspects of cross-lingual similarity. We also present the continuous word alignment-based similarity analysis, a new model to estimate similarity between text fragments. We compare the aforementioned approaches with several state-of-the-art models in the task of CL plagiarism detection and study their performance in detecting plagiarism cases of different lengths and obfuscation types. We conduct experiments over Spanish-English and German-English datasets. Experimental results show that continuous representations allow the continuous word alignment-based similarity analysis model to obtain competitive results and the knowledge-based document similarity model to outperform the state-of-the-art in CL plagiarism detection. © 2016 Elsevier B.V. All rights reserved. | es_ES |
dc.description.sponsorship | This research has been carried out in the framework of the FPI-UPV pre-doctoral grant (registration no. 3505) awarded to Parth Gupta and in the framework of the national projects DIANA-APPLICATIONS - Finding Hidden Knowledge in Texts: Applications (TIN2012-38603-C02-01), and SomEMBED: SOcial Media language understanding - EMBEDing contexts (TIN2015-71147-C2-1-P). We would like to thank Martin Potthast, Daniel Ortiz-Martinez, and Luis A. Leiva for their support and comments during this research. | en_EN |
dc.language | English | es_ES |
dc.publisher | Elsevier | es_ES |
dc.relation.ispartof | Knowledge-Based Systems | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject | Cross-language | es_ES |
dc.subject | Plagiarism detection | es_ES |
dc.subject | Continuous representations | es_ES |
dc.subject | Knowledge graphs | es_ES |
dc.subject | Multilingual semantic network | es_ES |
dc.subject.classification | COMPUTER LANGUAGES AND SYSTEMS | es_ES |
dc.title | Cross-language Plagiarism Detection over Continuous-space- and Knowledge Graph-based Representations of Language | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1016/j.knosys.2016.08.004 | |
dc.relation.projectID | info:eu-repo/grantAgreement/UPV//PRE-DOCTORAL GRANT%2F3505/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//TIN2012-38603-C02-01/ES/DIANA-APPLICATIONS: FINDING HIDDEN KNOWLEDGE IN TEXTS: APPLICATIONS/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//TIN2015-71147-C2-1-P/ES/COMPRENSION DEL LENGUAJE EN LOS MEDIOS DE COMUNICACION SOCIAL - REPRESENTANDO CONTEXTOS DE FORMA CONTINUA/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica | es_ES |
dc.description.bibliographicCitation | Franco-Salvador, M.; Gupta, PA.; Rosso, P.; Banchs, R. (2016). Cross-language Plagiarism Detection over Continuous-space- and Knowledge Graph-based Representations of Language. Knowledge-Based Systems. 111:87-99. https://doi.org/10.1016/j.knosys.2016.08.004 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | http://dx.doi.org/10.1016/j.knosys.2016.08.004 | es_ES |
dc.description.upvformatpinicio | 87 | es_ES |
dc.description.upvformatpfin | 99 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 111 | es_ES |
dc.relation.senia | 326671 | es_ES |
dc.contributor.funder | Ministerio de Economía y Competitividad | es_ES |
dc.contributor.funder | Universitat Politècnica de València | es_ES |