Cross-language Plagiarism Detection over Continuous-space- and Knowledge Graph-based Representations of Language

Franco-Salvador, Marc; Gupta, Parth Alokkumar; Rosso, Paolo; Banchs, Rafael

doi:10.1016/j.knosys.2016.08.004


Files in this item

Name: KNOSYS-Franco-et- ...
Size: 636.9Kb
Format: PDF
Description: Author's version.

Name: KNOSYS-Franco-et- ...
Size: 681.6Kb
Format: PDF
Description: Publisher's version.

dc.contributor.author	Franco-Salvador, Marc	es_ES
dc.contributor.author	Gupta, Parth Alokkumar	es_ES
dc.contributor.author	Rosso, Paolo	es_ES
dc.contributor.author	Banchs, Rafael	es_ES
dc.date.accessioned	2017-06-07T08:27:54Z
dc.date.available	2017-06-07T08:27:54Z
dc.date.issued	2016-11-01
dc.identifier.issn	0950-7051
dc.identifier.uri	http://hdl.handle.net/10251/82493
dc.description	This is the author’s version of a work that was accepted for publication in Knowledge-Based Systems. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Knowledge-Based Systems 111 (2016) 87–99. DOI 10.1016/j.knosys.2016.08.004.	es_ES
dc.description.abstract	Cross-language (CL) plagiarism detection aims at detecting plagiarised fragments of text among documents in different languages. The main research question of this work is whether knowledge graph representations and continuous space representations can complement each other and improve the state-of-the-art performance of CL plagiarism detection methods. To this end, we propose and evaluate hybrid models to assess the semantic similarity of two segments of text in different languages. The proposed hybrid models combine knowledge graph representations with continuous space representations, aiming to exploit their complementarity in capturing different aspects of cross-lingual similarity. We also present the continuous word alignment-based similarity analysis, a new model to estimate similarity between text fragments. We compare the aforementioned approaches with several state-of-the-art models in the task of CL plagiarism detection and study their performance in detecting plagiarism cases of different lengths and obfuscation types. We conduct experiments over Spanish-English and German-English datasets. Experimental results show that continuous representations allow the continuous word alignment-based similarity analysis model to obtain competitive results and the knowledge-based document similarity model to outperform the state-of-the-art in CL plagiarism detection. © 2016 Elsevier B.V. All rights reserved.	es_ES
dc.description.sponsorship	This research has been carried out in the framework of the FPI-UPV pre-doctoral grant (No de registro - 3505) awarded to Parth Gupta and in the framework of the national projects DIANA-APPLICATIONS - Finding Hidden Knowledge in Texts: Applications (TIN2012-38603-C02-01), and SomEMBED: SOcial Media language understanding - EMBEDing contexts (TIN2015-71147-C2-1-P). We would like to thank Martin Potthast, Daniel Ortiz-Martinez, and Luis A. Leiva for their support and comments during this research.	en_EN
dc.language	English	es_ES
dc.publisher	Elsevier	es_ES
dc.relation.ispartof	Knowledge-Based Systems	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	Cross-language	es_ES
dc.subject	Plagiarism detection	es_ES
dc.subject	Continuous representations	es_ES
dc.subject	Knowledge graphs	es_ES
dc.subject	Multilingual semantic network	es_ES
dc.subject.classification	COMPUTER LANGUAGES AND SYSTEMS	es_ES
dc.title	Cross-language Plagiarism Detection over Continuous-space- and Knowledge Graph-based Representations of Language	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.1016/j.knosys.2016.08.004
dc.relation.projectID	info:eu-repo/grantAgreement/UPV//PRE-DOCTORAL GRANT%2F3505/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MINECO//TIN2012-38603-C02-01/ES/DIANA-APPLICATIONS: FINDING HIDDEN KNOWLEDGE IN TEXTS: APPLICATIONS/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MINECO//TIN2015-71147-C2-1-P/ES/COMPRENSION DEL LENGUAJE EN LOS MEDIOS DE COMUNICACION SOCIAL - REPRESENTANDO CONTEXTOS DE FORMA CONTINUA/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica	es_ES
dc.description.bibliographicCitation	Franco-Salvador, M.; Gupta, PA.; Rosso, P.; Banchs, R. (2016). Cross-language Plagiarism Detection over Continuous-space- and Knowledge Graph-based Representations of Language. Knowledge-Based Systems. 111:87-99. https://doi.org/10.1016/j.knosys.2016.08.004	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	http://dx.doi.org/10.1016/j.knosys.2016.08.004	es_ES
dc.description.upvformatpinicio	87	es_ES
dc.description.upvformatpfin	99	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	111	es_ES
dc.relation.senia	326671	es_ES
dc.contributor.funder	Ministerio de Economía y Competitividad	es_ES
dc.contributor.funder	Universitat Politècnica de València	es_ES

This item appears in the following collection(s)

Articles, conferences, monographs [48344]

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia