
Methods for cross-language plagiarism detection

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia

dc.contributor.author Barrón Cedeño, Luis Alberto es_ES
dc.contributor.author Gupta, Parth Alokkumar es_ES
dc.contributor.author Rosso, Paolo es_ES
dc.date.accessioned 2014-07-01T17:00:41Z
dc.date.issued 2013-09
dc.identifier.issn 0950-7051
dc.identifier.uri http://hdl.handle.net/10251/38501
dc.description NOTICE: this is the author's version (preprint) of a work that was accepted for publication in Knowledge-Based Systems. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Knowledge-Based Systems. 50:211-217. doi:10.1016/j.knosys.2013.06.018. es_ES
dc.description.abstract Three reasons explain why plagiarism across languages is on the rise: (i) speakers of under-resourced languages often consult documentation in a foreign language, (ii) people immersed in a foreign country can still consult material written in their native language, and (iii) people are often interested in writing in a language different to their native one. Most efforts for automatically detecting cross-language plagiarism depend on a preliminary translation, which is not always available. In this paper we propose a freely available architecture for plagiarism detection across languages covering the entire process: heuristic retrieval, detailed analysis, and post-processing. On top of this architecture we explore the suitability of three cross-language similarity estimation models: Cross-Language Alignment-based Similarity Analysis (CL-ASA), Cross-Language Character n-Grams (CL-CNG), and Translation plus Monolingual Analysis (T + MA); three models that differ inherently in nature and in the resources they require. The three models are tested extensively under the same conditions on the different plagiarism detection sub-tasks, something never done before. The experiments show that T + MA produces the best results, closely followed by CL-ASA. Still, CL-ASA obtains higher values of precision, an important factor in plagiarism detection when less user intervention is desired. es_ES
dc.format.extent 7 es_ES
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Knowledge-Based Systems es_ES
dc.rights All rights reserved es_ES
dc.subject Automatic plagiarism detection es_ES
dc.subject Cross-language plagiarism es_ES
dc.subject Plagiarism detection architecture es_ES
dc.subject Cross-language similarity es_ES
dc.subject Text re-use analysis es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Methods for cross-language plagiarism detection es_ES
dc.type Article es_ES
dc.embargo.lift 10000-01-01
dc.embargo.terms forever es_ES
dc.identifier.doi 10.1016/j.knosys.2013.06.018
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Barrón Cedeño, LA.; Gupta, PA.; Rosso, P. (2013). Methods for cross-language plagiarism detection. Knowledge-Based Systems. 50:211-217. doi:10.1016/j.knosys.2013.06.018 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion http://dx.doi.org/10.1016/j.knosys.2013.06.018 es_ES
dc.description.upvformatpinicio 211 es_ES
dc.description.upvformatpfin 217 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 50 es_ES
dc.relation.senia 255664

