AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in Immigration-Related Web News Comments Using Transformers and Statistical Models

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia



dc.contributor.author Magnossao de Paula, Angel Felipe es_ES
dc.contributor.author Baris Schlicht, Ipek es_ES
dc.date.accessioned 2022-12-12T08:08:34Z
dc.date.available 2022-12-12T08:08:34Z
dc.date.issued 2021-09-21 es_ES
dc.identifier.issn 1613-0073 es_ES
dc.identifier.uri http://hdl.handle.net/10251/190553
dc.description.abstract [EN] This paper describes our participation in the DEtection of TOXicity in comments In Spanish (DETOXIS) shared task 2021 at the 3rd Workshop on Iberian Languages Evaluation Forum. The shared task is divided into two related classification tasks: (i) Task 1, toxicity detection, and (ii) Task 2, toxicity level detection. Both focus on xenophobia, a problem exacerbated by the spread of toxic comments posted on online news articles related to immigration. Detecting toxicity in these comments is a necessary step towards mitigating the problem. Our main objective was to implement an accurate model to detect xenophobia in comments on web news articles within the DETOXIS shared task 2021, as measured by the competition's official metrics: the F1-score for Task 1 and the Closeness Evaluation Metric (CEM) for Task 2. To solve the tasks, we worked with two types of machine learning models: (i) statistical models and (ii) Bidirectional Encoder Representations from Transformers (BERT) models. We obtained our best results in both tasks using BETO, a BERT model trained on a large Spanish corpus. We placed 3rd in the official Task 1 ranking with an F1-score of 0.5996 and 6th in the official Task 2 ranking with a CEM of 0.7142. Our results suggest that (i) BERT models obtain better results than statistical models for toxicity detection in text comments, and (ii) monolingual BERT models have an advantage over multilingual BERT models for toxicity detection in text comments in their pre-training language. es_ES
dc.language English es_ES
dc.publisher CEUR Workshop es_ES
dc.relation.ispartof Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2021) es_ES
dc.rights Attribution (by) es_ES
dc.subject Spanish text classification es_ES
dc.subject Toxicity detection es_ES
dc.subject Deep learning es_ES
dc.subject Transformers es_ES
dc.subject BERT es_ES
dc.subject Statistical models es_ES
dc.title AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in Immigration-Related Web News Comments Using Transformers and Statistical Models es_ES
dc.type Conference paper es_ES
dc.type Article es_ES
dc.rights.accessRights Open es_ES
dc.description.bibliographicCitation Magnossao De Paula, AF.; Baris Schlicht, I. (2021). AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in Immigration-Related Web News Comments Using Transformers and Statistical Models. CEUR Workshop. 547-566. http://hdl.handle.net/10251/190553 es_ES
dc.description.accrualMethod S es_ES
dc.relation.conferencename Iberian Languages Evaluation Forum (IberLEF 2021) es_ES
dc.relation.conferencedate September 21, 2021 es_ES
dc.relation.conferenceplace Online es_ES
dc.relation.publisherversion https://ceur-ws.org/Vol-2943/ es_ES
dc.description.upvformatpinicio 547 es_ES
dc.description.upvformatpfin 566 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.relation.pasarela S\450749 es_ES
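
The abstract names the F1-score as the official Task 1 metric. As a rough illustration of what that number measures, the sketch below computes a binary F1-score from gold and predicted labels. It is not the organizers' evaluation script: the function name `f1_score`, the toy label lists, and the convention that label 1 means "toxic" are all assumptions made for this example.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for one class.

    Assumption: `positive` (here 1) marks the toxic class. The official
    DETOXIS evaluation may define or average the metric differently.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels, not taken from the DETOXIS data.
gold = [1, 0, 1, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 0]
print(round(f1_score(gold, pred), 4))  # 2 TP, 1 FP, 1 FN -> 0.6667
```

Because F1 ignores true negatives, it is a common choice when the positive class (here, toxic comments) is the minority and plain accuracy would look deceptively high.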

