Determining and Characterizing the Reused Text for Plagiarism Detection

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

dc.contributor.author Sánchez-Vega, Fernando es_ES
dc.contributor.author Villatoro-Tello, Esaú es_ES
dc.contributor.author Montes-y-Gómez, Manuel es_ES
dc.contributor.author Villaseñor-Pineda, Luis es_ES
dc.contributor.author Rosso, Paolo es_ES
dc.date.accessioned 2014-06-23T10:26:58Z
dc.date.issued 2013-04
dc.identifier.issn 0957-4174
dc.identifier.uri http://hdl.handle.net/10251/38255
dc.description.abstract An important task in plagiarism detection is determining and measuring similar text portions between a given pair of documents. One of the main difficulties of this task lies in the fact that reused text is commonly modified with the aim of covering or camouflaging the plagiarism. Another difficulty is that not all similar text fragments are examples of plagiarism, since thematic coincidences also tend to produce portions of similar text. To tackle these problems, we propose a novel method for detecting likely portions of reused text. This method is able to detect common actions performed by plagiarists, such as word deletion, insertion and transposition, which allows it to obtain plausible portions of reused text. We also propose representing the identified reused text by means of a set of features that denote its degree of plagiarism, relevance and fragmentation. This new representation aims to facilitate the recognition of plagiarism by considering diverse characteristics of the reused text during the classification phase. Experimental results employing a supervised classification strategy showed that the proposed method is able to outperform traditionally used approaches. © 2012 Elsevier Ltd. All rights reserved. es_ES
dc.description.sponsorship This work was partially supported by CONACyT (project grant 134186 and scholarships 258345/224483). This work is the result of the collaboration in the framework of the WIQEI IRSES project (Grant No. 269180) within the FP7 Marie Curie. The work of the last author was in the framework of the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems. en_EN
dc.format.extent 10 es_ES
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Expert Systems with Applications es_ES
dc.rights All rights reserved es_ES
dc.subject Plagiarism detection es_ES
dc.subject Text reuse es_ES
dc.subject Machine learning es_ES
dc.subject Supervised classification es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Determining and Characterizing the Reused Text for Plagiarism Detection es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.eswa.2012.09.021
dc.relation.projectID info:eu-repo/grantAgreement/CONACYT//134186/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/269180/EU/Web Information Quality Evaluation Initiative/ en_EN
dc.relation.projectID info:eu-repo/grantAgreement/CONACYT//258345%2F224483/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Sánchez-Vega, F.; Villatoro-Tello, E.; Montes-y-Gómez, M.; Villaseñor-Pineda, L.; Rosso, P. (2013). Determining and Characterizing the Reused Text for Plagiarism Detection. Expert Systems with Applications. 40(5):1804-1813. https://doi.org/10.1016/j.eswa.2012.09.021 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion http://dx.doi.org/10.1016/j.eswa.2012.09.021 es_ES
dc.description.upvformatpinicio 1804 es_ES
dc.description.upvformatpfin 1813 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 40 es_ES
dc.description.issue 5 es_ES
dc.relation.senia 255772
dc.contributor.funder Consejo Nacional de Ciencia y Tecnología, México es_ES
dc.contributor.funder European Commission

