
Wikipedia vandalism detection: combining natural language, metadata, and reputation features

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia


dc.contributor.author Adler, B. Thomas es_ES
dc.contributor.author Alfaro, Luca de es_ES
dc.contributor.author Mola Velasco, Santiago Moisés es_ES
dc.contributor.author Rosso, Paolo es_ES
dc.contributor.author West, Andrew G es_ES
dc.date.accessioned 2014-03-25T08:24:16Z
dc.date.issued 2011
dc.identifier.isbn 978-3-642-19436-8
dc.identifier.issn 0302-9743
dc.identifier.uri http://hdl.handle.net/10251/36621
dc.description.abstract Wikipedia is an online encyclopedia that anyone can edit. While most edits are constructive, about 7% are acts of vandalism. Such behavior is characterized by modifications made in bad faith, introducing spam and other inappropriate content. In this work, we present the results of an effort to integrate three of the leading approaches to Wikipedia vandalism detection: a spatio-temporal analysis of metadata (STiki), a reputation-based system (WikiTrust), and natural language processing features. The resulting joint system outperforms all previous methods and establishes a new baseline for Wikipedia vandalism detection. We examine in detail the contribution of the three approaches, both for the task of discovering fresh vandalism and for the task of locating vandalism in the complete set of Wikipedia revisions. es_ES
dc.description.sponsorship The authors from Universitat Politècnica de València also thank the MICINN research project TEXT-ENTERPRISE 2.0 TIN2009-13391-C04-03 (Plan I+D+i). UPenn contributions were supported in part by ONR MURI N00014-07-1-0907. This research was partially supported by award 1R01GM089820-01A1 from the National Institute of General Medical Sciences, and by ISSDM, a UCSC-LANL educational collaboration. es_ES
dc.format.extent 12 es_ES
dc.language English es_ES
dc.publisher Springer Verlag (Germany) es_ES
dc.relation.ispartof Computational Linguistics and Intelligent Text Processing es_ES
dc.relation.ispartofseries Lecture Notes in Computer Science;vol. 6609
dc.rights All rights reserved es_ES
dc.subject Database Management es_ES
dc.subject Data Mining and Knowledge Discovery es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Wikipedia vandalism detection: combining natural language, metadata, and reputation features es_ES
dc.type Book chapter es_ES
dc.embargo.lift 10000-01-01
dc.embargo.terms forever es_ES
dc.identifier.doi 10.1007/978-3-642-19437-5_23
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2009-13391-C04-03/ES/Text-Enterprise 2.0: Tecnicas De Comprension De Textos Aplicadas A Las Necesidades De La Empresa 2.0/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/NIH//1R01GM089820-01A1/US/The Gene Wiki: Community intelligence applied to gene annotation/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/ONR//N00014-07-1-0907/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Adler, BT.; Alfaro, LD.; Mola Velasco, SM.; Rosso, P.; West, AG. (2011). Wikipedia vandalism detection: combining natural language, metadata, and reputation features. In Computational Linguistics and Intelligent Text Processing. Springer Verlag (Germany). 6609:277-288. https://doi.org/10.1007/978-3-642-19437-5_23 es_ES
dc.description.accrualMethod S es_ES
dc.relation.conferencename 12th International Conference, CICLing 2011 es_ES
dc.relation.conferencedate February 20-26, 2011 es_ES
dc.relation.conferenceplace Tokyo, Japan es_ES
dc.relation.publisherversion http://link.springer.com/chapter/10.1007/978-3-642-19437-5_23 es_ES
dc.description.upvformatpinicio 277 es_ES
dc.description.upvformatpfin 288 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 6609 es_ES
dc.relation.senia 215394
dc.contributor.funder Ministerio de Ciencia e Innovación es_ES
dc.contributor.funder National Institutes of Health, USA es_ES
dc.contributor.funder Office of Naval Research es_ES
dc.contributor.funder National Institute of General Medical Sciences, USA es_ES
dc.contributor.funder Institute for Scalable Scientific Data Management es_ES
dc.description.references Wikimedia Foundation: Wikipedia (2010) [Online; accessed December 29, 2010] es_ES
dc.description.references Wikimedia Foundation: Wikistats (2010) [Online; accessed December 29, 2010] es_ES
dc.description.references Potthast, M.: Crowdsourcing a Wikipedia Vandalism Corpus. In: Proc. of the 33rd Intl. ACM SIGIR Conf. (SIGIR 2010). ACM Press, New York (July 2010) es_ES
dc.description.references Gralla, P.: U.S. senator: It’s time to ban Wikipedia in schools, libraries, http://blogs.computerworld.com/4598/u_s_senator_its_time_to_ban_wikipedia_in_schools_libraries [Online; accessed November 15, 2010] es_ES
dc.description.references Olanoff, L.: School officials unite in banning Wikipedia. Seattle Times (November 2007) es_ES
dc.description.references Mola-Velasco, S.M.: Wikipedia Vandalism Detection Through Machine Learning: Feature Review and New Proposals. In: Braschler, M., Harman, D. (eds.) Notebook Papers of CLEF 2010 LABs and Workshops, Padua, Italy, September 22-23 (2010) es_ES
dc.description.references Adler, B., de Alfaro, L., Pye, I.: Detecting Wikipedia Vandalism using WikiTrust. In: Braschler, M., Harman, D. (eds.) Notebook Papers of CLEF 2010 LABs and Workshops, Padua, Italy, September 22-23 (2010) es_ES
dc.description.references West, A.G., Kannan, S., Lee, I.: Detecting Wikipedia Vandalism via Spatio-Temporal Analysis of Revision Metadata. In: EUROSEC 2010: Proceedings of the Third European Workshop on System Security, pp. 22–28 (2010) es_ES
dc.description.references West, A.G.: STiki: A Vandalism Detection Tool for Wikipedia (2010), http://en.wikipedia.org/wiki/Wikipedia:STiki es_ES
dc.description.references Wikipedia: User:AntiVandalBot – Wikipedia (2010), http://en.wikipedia.org/wiki/User:AntiVandalBot [Online; accessed November 2, 2010] es_ES
dc.description.references Wikipedia: User:MartinBot – Wikipedia (2010), http://en.wikipedia.org/wiki/User:MartinBot [Online; accessed November 2, 2010] es_ES
dc.description.references Wikipedia: User:ClueBot – Wikipedia (2010), http://en.wikipedia.org/wiki/User:ClueBot [Online; accessed November 2, 2010] es_ES
dc.description.references Carter, J.: ClueBot and Vandalism on Wikipedia (2008), http://www.acm.uiuc.edu/~carter11/ClueBot.pdf [Online; accessed November 2, 2010] es_ES
dc.description.references Rodríguez Posada, E.J.: AVBOT: detección y corrección de vandalismos en Wikipedia. NovATIca (203), 51–53 (2010) es_ES
dc.description.references Potthast, M., Stein, B., Gerling, R.: Automatic Vandalism Detection in Wikipedia. In: Macdonald, C., Ounis, I., Plachouras, V., Ruthven, I., White, R.W. (eds.) ECIR 2008. LNCS, vol. 4956, pp. 663–668. Springer, Heidelberg (2008) es_ES
dc.description.references Smets, K., Goethals, B., Verdonk, B.: Automatic Vandalism Detection in Wikipedia: Towards a Machine Learning Approach. In: WikiAI 2008: Proceedings of the Workshop on Wikipedia and Artificial Intelligence: An Evolving Synergy, pp. 43–48. AAAI Press, Menlo Park (2008) es_ES
dc.description.references Druck, G., Miklau, G., McCallum, A.: Learning to Predict the Quality of Contributions to Wikipedia. In: WikiAI 2008: Proceedings of the Workshop on Wikipedia and Artificial Intelligence: An Evolving Synergy, pp. 7–12. AAAI Press, Menlo Park (2008) es_ES
dc.description.references Itakura, K.Y., Clarke, C.L.: Using Dynamic Markov Compression to Detect Vandalism in the Wikipedia. In: SIGIR 2009: Proc. of the 32nd Intl. ACM Conference on Research and Development in Information Retrieval, pp. 822–823 (2009) es_ES
dc.description.references Chin, S.C., Street, W.N., Srinivasan, P., Eichmann, D.: Detecting Wikipedia Vandalism with Active Learning and Statistical Language Models. In: WICOW 2010: Proc. of the 4th Workshop on Information Credibility on the Web (April 2010) es_ES
dc.description.references Zeng, H., Alhoussaini, M., Ding, L., Fikes, R., McGuinness, D.: Computing Trust from Revision History. In: Intl. Conf. on Privacy, Security and Trust (2006) es_ES
dc.description.references McGuinness, D., Zeng, H., da Silva, P., Ding, L., Narayanan, D., Bhaowal, M.: Investigation into Trust for Collaborative Information Repositories: A Wikipedia Case Study. In: Proc. of the Workshop on Models of Trust for the Web (2006) es_ES
dc.description.references Adler, B., de Alfaro, L.: A Content-Driven Reputation System for the Wikipedia. In: WWW 2007: Proceedings of the 16th International World Wide Web Conference. ACM Press, New York (2007) es_ES
dc.description.references Belani, A.: Vandalism Detection in Wikipedia: a Bag-of-Words Classifier Approach. Computing Research Repository (CoRR) abs/1001.0700 (2010) es_ES
dc.description.references Potthast, M., Stein, B., Holfeld, T.: Overview of the 1st International Competition on Wikipedia Vandalism Detection. In: Braschler, M., Harman, D. (eds.) Notebook Papers of CLEF 2010 LABs and Workshops, Padua, Italy, September 22-23 (2010) es_ES
dc.description.references Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.: The WEKA Data Mining Software: An Update. SIGKDD Explorations 11(1) (2009) es_ES
dc.description.references Breiman, L.: Random Forests. Machine Learning 45(1), 5–32 (2001) es_ES
dc.description.references Davis, J., Goadrich, M.: The relationship between Precision-Recall and ROC curves. In: ICML 2006: Proc. of the 23rd Intl. Conf. on Machine Learning (2006) es_ES

