
PAN@FIRE: Overview of the cross-language !ndian Text re-use detection competition

RiuNet: Institutional Repository of the Universitat Politècnica de València



dc.contributor.author Barrón Cedeño, Luis Alberto es_ES
dc.contributor.author Rosso, Paolo es_ES
dc.contributor.author Sobha, Lalitha Devi es_ES
dc.contributor.author Clough, Paul es_ES
dc.contributor.author Stevenson, Mark es_ES
dc.date.accessioned 2014-07-08T17:47:59Z
dc.date.issued 2013
dc.identifier.isbn 978-3-642-40086-5
dc.identifier.issn 0302-9743
dc.identifier.uri http://hdl.handle.net/10251/38682
dc.description The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-40087-2_6 es_ES
dc.description.abstract The development of models for automatic detection of text re-use and plagiarism across languages has received increasing attention in recent years. However, the lack of an evaluation framework composed of annotated datasets has caused these efforts to be isolated. In this paper we present the CL!TR 2011 corpus, the first manually created corpus for the analysis of cross-language text re-use between English and Hindi. The corpus was used during the Cross-Language !ndian Text Re-Use Detection Competition. Here we overview the approaches applied by the contestants and evaluate their quality when detecting a re-used text together with its source. es_ES
dc.description.sponsorship This research work is partially funded by the WIQ-EI (IRSES grant n. 269180) and ACCURAT (grant n. 248347) projects, and the Seventh Framework Programme (FP7/2007-2013) under grant agreement n. 246016 from the European Union. The first author was partially funded by the CONACyT-Mexico 192021 grant and currently works under the ERCIM “Alain Bensoussan” Fellowship Programme. The research of the second author is in the framework of the VLC/Campus Microcluster on Multimodal Interaction in Intelligent Systems and partially funded by the MICINN research project TEXT-ENTERPRISE 2.0 TIN2009-13391-C04-03 (plan I+D+i). The research from AU-KBC Centre is supported by the Cross Lingual Information Access (CLIA) Phase II Project.
dc.format.extent 12 es_ES
dc.language English es_ES
dc.publisher Springer Verlag (Germany) es_ES
dc.relation.ispartof Multilingual Information Access in South Asian Languages es_ES
dc.relation.ispartofseries Lecture Notes in Computer Science;
dc.rights All rights reserved es_ES
dc.subject.classification LENGUAJES Y SISTEMAS INFORMATICOS es_ES
dc.title PAN@FIRE: Overview of the cross-language !ndian Text re-use detection competition es_ES
dc.type Book chapter es_ES
dc.identifier.doi 10.1007/978-3-642-40087-2_6
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/246016/EU/Alain Bensoussan Career Development Enhancer/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/CONACyT//192021/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/269180/EU/Web Information Quality Evaluation Initiative/
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2009-13391-C04-03/ES/Text-Enterprise 2.0: Tecnicas De Comprension De Textos Aplicadas A Las Necesidades De La Empresa 2.0/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/248347/EU/Analysis and Evaluation of Comparable Corpora for Under Resourced Areas of Machine Translation/
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/246016 /EU/
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Barrón Cedeño, LA.; Rosso, P.; Sobha, LD.; Clough, P.; Stevenson, M. (2013). PAN@FIRE: Overview of the cross-language !ndian Text re-use detection competition. In Multilingual Information Access in South Asian Languages. Springer Verlag (Germany). 7536:59-70. https://doi.org/10.1007/978-3-642-40087-2_6 es_ES
dc.description.accrualMethod S es_ES
dc.relation.conferencename Second International Workshop, FIRE 2010 es_ES
dc.relation.conferencedate February 19-21, 2010 es_ES
dc.relation.conferenceplace Gandhinagar, India es_ES
dc.relation.publisherversion http://link.springer.com/chapter/10.1007/978-3-642-40087-2_6 es_ES
dc.description.upvformatpinicio 59 es_ES
dc.description.upvformatpfin 70 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 7536 es_ES
dc.relation.senia 255810
dc.contributor.funder European Commission
dc.contributor.funder Consejo Nacional de Ciencia y Tecnología, México
dc.contributor.funder European Research Consortium for Informatics and Mathematics
dc.contributor.funder Ministerio de Ciencia e Innovación
dc.contributor.funder Department of Electronics and Information Technology, Ministry of Communications and Information Technology, India
dc.description.references Addanki, K., Wu, D.: An Evaluation of MT Alignment Baseline Approaches upon Cross-Lingual Plagiarism Detection. In: FIRE [12] es_ES
dc.description.references Aggarwal, N., Asooja, K., Buitelaar, P.: Cross Lingual Text Reuse Detection Using Machine Translation & Similarity Measures. In: FIRE [12] es_ES
dc.description.references Alegria, I., Forcada, M., Sarasola, K. (eds.): Proceedings of the SEPLN 2009 Workshop on Information Retrieval and Information Extraction for Less Resourced Languages. University of the Basque Country, Donostia (2009) es_ES
dc.description.references Barrón-Cedeño, A., Rosso, P., Pinto, D., Juan, A.: On Cross-Lingual Plagiarism Analysis Using a Statistical Model. In: Stein, B., Stamatatos, E., Koppel, M. (eds.) ECAI 2008 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 2008), vol. 377, pp. 9–13. CEUR-WS.org, Patras (2008), http://ceur-ws.org/Vol-377 es_ES
dc.description.references Bendersky, M., Croft, W.: Finding Text Reuse on the Web. In: Baeza-Yates, R., Boldi, P., Ribeiro-Neto, B., Cambazoglu, B. (eds.) Proceedings of the Second ACM International Conference on Web Search and Web Data Mining, pp. 262–271. ACM, Barcelona (2009) es_ES
dc.description.references Ceska, Z., Toman, M., Jezek, K.: Multilingual Plagiarism Detection. In: Proceedings of the 13th International Conference on Artificial Intelligence (ICAI 2008), pp. 83–92. Springer, Varna (2008) es_ES
dc.description.references Clough, P.: Plagiarism in Natural and Programming Languages: an Overview of Current Tools and Technologies. Research Memoranda: CS-00-05, Department of Computer Science. University of Sheffield, UK (2000) es_ES
dc.description.references Clough, P.: Old and new challenges in automatic plagiarism detection. National UK Plagiarism Advisory Service (2003), http://ir.shef.ac.uk/cloughie/papers/pasplagiarism.pdf es_ES
dc.description.references Clough, P., Gaizauskas, R.: Corpora and Text Re-Use. In: Lüdeling, A., Kytö, M., McEnery, T. (eds.) Handbook of Corpus Linguistics. Handbooks of Linguistics and Communication Science, pp. 1249–1271. Mouton de Gruyter (2009) es_ES
dc.description.references Clough, P., Stevenson, M.: Developing a Corpus of Plagiarised Examples. Language Resources and Evaluation 45(1), 5–24 (2011) es_ES
dc.description.references Comas, R., Sureda, J.: Academic Cyberplagiarism: Tracing the Causes to Reach Solutions. In: Comas, R., Sureda, J. (eds.) Academic Cyberplagiarism [online dossier], Digithum. Iss, vol. 10, pp. 1–6. UOC (2008), http://bit.ly/cyberplagiarism_cs es_ES
dc.description.references Majumder, P., Mitra, M., Bhattacharyya, P., Subramaniam, L., Contractor, D., Rosso, P. (eds.): FIRE 2010 and 2011. LNCS, vol. 7536. Springer, Heidelberg (2013) es_ES
dc.description.references Gale, W., Church, K.: A Program for Aligning Sentences in Bilingual Corpora. Computational Linguistics 19, 75–102 (1993) es_ES
dc.description.references Ghosh, A., Bhaskar, P., Pal, S., Bandyopadhyay, S.: Rule Based Plagiarism Detection using Information Retrieval. In: Petras, et al. [24] es_ES
dc.description.references Gupta, P., Singhal, K.: Mapping Hindi-English Text Re-use Document Pairs. In: FIRE [12] es_ES
dc.description.references Head, A.: How today’s college students use Wikipedia for course-related research. First Monday 15(3) (March 2010), http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2830/2476 es_ES
dc.description.references IEEE: A Plagiarism FAQ (2008), http://bit.ly/ieee_plagiarism (published: 2008; accessed March 3, 2010) es_ES
dc.description.references Kulathuramaiyer, N., Maurer, H.: Coping With the Copy-Paste-Syndrome. In: Proceedings of World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education 2007 (E-Learn 2007), pp. 1072–1079. AACE, Quebec City (2007) es_ES
dc.description.references Lee, C., Wu, C., Yang, H.: A Platform Framework for Cross-lingual Text Relatedness Evaluation and Plagiarism Detection. In: Proceedings of the 3rd International Conference on Innovative Computing Information (ICICIC 2008). IEEE Computer Society (2008) es_ES
dc.description.references Martínez, I.: Wikipedia Usage by Mexican Students. The Constant Usage of Copy and Paste. In: Wikimania 2009, Buenos Aires, Argentina (2009), http://wikimania2009.wikimedia.org es_ES
dc.description.references Maurer, H., Kappe, F., Zaka, B.: Plagiarism - a survey. Journal of Universal Computer Science 12(8), 1050–1084 (2006) es_ES
dc.description.references Palkovskii, Y., Belov, A.: Exploring Cross Lingual Plagiarism Detection in Hindi-English with n-gram Fingerprinting and VSM based Similarity Detection. In: FIRE [12] es_ES
dc.description.references Palkovskii, Y., Belov, A., Muzika, I.: Using WordNet-based Semantic Similarity Measurement in External Plagiarism Detection - Notebook for PAN at CLEF 2011. In: Petras, et al. [24] es_ES
dc.description.references Petras, V., Forner, P., Clough, P. (eds.): Notebook Papers of CLEF 2011 LABs and Workshops, Amsterdam, The Netherlands (September 2011) es_ES
dc.description.references Potthast, M., Stein, B., Eiselt, A., Barrón-Cedeño, A., Rosso, P.: Overview of the 1st international competition on plagiarism detection. In: Stein, B., Rosso, P., Stamatatos, E., Koppel, M., Agirre, E. (eds.) SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse (PAN 2009), vol. 502, pp. 1–9. CEUR-WS.org, San Sebastian (2009), http://ceur-ws.org/Vol-502 es_ES
dc.description.references Potthast, M., Barrón-Cedeño, A., Stein, B., Rosso, P.: Cross-Language Plagiarism Detection. Language Resources and Evaluation (LRE), Special Issue on Plagiarism and Authorship Analysis 45(1), 1–18 (2011) es_ES
dc.description.references Potthast, M., Eiselt, A., Barrón-Cedeño, A., Stein, B., Rosso, P.: Overview of the 3rd International Competition on Plagiarism Detection. In: Petras, et al. [24] es_ES
dc.description.references Potthast, M., Stein, B., Barrón-Cedeño, A., Rosso, P.: An Evaluation Framework for Plagiarism Detection. In: Huang, C.R., Jurafsky, D. (eds.) Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), pp. 997–1005. COLING 2010 Organizing Committee, Beijing (2010) es_ES
dc.description.references Potthast, M., Barrón-Cedeño, A., Eiselt, A., Stein, B., Rosso, P.: Overview of the 2nd International Competition on Plagiarism Detection. In: Braschler, M., Harman, D. (eds.) Notebook Papers of CLEF 2010 LABs and Workshops, Padua, Italy (September 2010) es_ES
dc.description.references Rambhoopal, K., Varma, V.: Cross-Lingual Text Reuse Detection Based On Keyphrase Extraction and Similarity Measures. In: FIRE [12] es_ES
dc.description.references Weber, S.: Das Google-Copy-Paste-Syndrom. Wie Netzplagiate Ausbildung und Wissen gefährden. Telepolis (2007) es_ES

