Detecting Ethnicity-targeted Hate Speech in Russian Social Media Texts

Pronoza, Ekaterina; Panicheva, Polina; Koltsova, Olessia; Rosso, Paolo

doi:10.1016/j.ipm.2021.102674


Files in this item

Name: PronozaPanichevaK ...

Size: 1.685Mb

Format: PDF

Description: Author's version.

Name: Detecting ethnici ...

Size: 3.526Mb

Format: PDF

Description: Publisher's version.

dc.contributor.author	Pronoza, Ekaterina	es_ES
dc.contributor.author	Panicheva, Polina	es_ES
dc.contributor.author	Koltsova, Olessia	es_ES
dc.contributor.author	Rosso, Paolo	es_ES
dc.date.accessioned	2022-06-03T18:02:18Z
dc.date.available	2022-06-03T18:02:18Z
dc.date.issued	2021-11	es_ES
dc.identifier.issn	0306-4573	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/183078
dc.description.abstract	[EN] Ethnicity-targeted hate speech has been widely shown to influence on-the-ground inter-ethnic conflict and violence, especially in such multi-ethnic societies as Russia. Therefore, ethnicity-targeted hate speech detection in user texts is becoming an important task. However, it faces a number of unresolved problems: difficulties of reliable mark-up, informal and indirect ways of expressing negativity in user texts (such as irony, false generalization and attribution of unfavored actions to targeted groups), users' inclination to express opposite attitudes to different ethnic groups in the same text and, finally, lack of research on languages other than English. In this work we address several of these problems in the task of ethnicity-targeted hate speech detection in Russian-language social media texts. This approach allows us to differentiate between attitudes towards different ethnic groups mentioned in the same text, a task that has never been addressed before. We use a dataset of over 2.6M user messages mentioning ethnic groups to construct a representative sample of 12K instances (ethnic group, text) that are further thoroughly annotated via a special procedure. In contrast to many previous collections that usually comprise extreme cases of toxic speech, representativity of our sample secures a realistic and, therefore, much higher proportion of subtle negativity which additionally complicates its automatic detection. We then experiment with four types of machine learning models, from traditional classifiers such as SVM to deep learning approaches, notably the recently introduced BERT architecture, and interpret their predictions in terms of various linguistic phenomena. In addition to hate speech detection with a text-level two-class approach (hate, no hate), we also justify and implement a unique instance-based three-class approach (positive, neutral, negative attitude, the latter implying hate speech).
Our best results are achieved by using fine-tuned and pre-trained RuBERT combined with linguistic features, with F1-hate=0.760, F1-macro=0.833 on the text-level two-class problem comparable to previous studies, and F1-hate=0.813, F1-macro=0.824 on our unique instance-based three-class hate speech detection task. Finally, we perform error analysis, and it reveals that further improvement could be achieved by accounting for complex and creative language issues more accurately, i.e., by detecting irony and unconventional forms of obscene lexicon.	es_ES
dc.language	English	es_ES
dc.publisher	Elsevier	es_ES
dc.relation.ispartof	Information Processing & Management	es_ES
dc.rights	Attribution - NonCommercial - NoDerivatives (by-nc-nd)	es_ES
dc.subject	Hate speech detection	es_ES
dc.subject	Ethnic hate	es_ES
dc.subject	Russian language	es_ES
dc.subject	Deep learning	es_ES
dc.subject.classification	COMPUTER LANGUAGES AND SYSTEMS	es_ES
dc.title	Detecting Ethnicity-targeted Hate Speech in Russian Social Media Texts	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.1016/j.ipm.2021.102674	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.description.bibliographicCitation	Pronoza, E.; Panicheva, P.; Koltsova, O.; Rosso, P. (2021). Detecting Ethnicity-targeted Hate Speech in Russian Social Media Texts. Information Processing & Management. 58(6):1-24. https://doi.org/10.1016/j.ipm.2021.102674	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	https://doi.org/10.1016/j.ipm.2021.102674	es_ES
dc.description.upvformatpinicio	1	es_ES
dc.description.upvformatpfin	24	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	58	es_ES
dc.description.issue	6	es_ES
dc.relation.pasarela	S\463418	es_ES
dc.subject.ods	04.- Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all	es_ES

This item appears in the following collection(s)

Articles, conferences, monographs [48344]

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia
