
Transformer-based models for multimodal irony detection

RiuNet: Institutional Repository of the Universitat Politècnica de València

Tomás, D.; Ortega-Bueno, R.; Zhang, G.; Rosso, P.; Schifanella, R. (2023). Transformer-based models for multimodal irony detection. Journal of Ambient Intelligence and Humanized Computing. 14:7399-7410. https://doi.org/10.1007/s12652-022-04447-y

Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/195757

Item metadata

Title: Transformer-based models for multimodal irony detection
Authors: Tomás, David; Ortega-Bueno, Reynier; Zhang, Guobiao; Rosso, Paolo; Schifanella, Rossano
UPV entity: Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica
Issue date: 2023
Abstract:
[EN] Irony is nowadays a pervasive phenomenon in social networks. The multimodal functionalities of these platforms (i.e., the possibility to attach audio, video, and images to textual information) are increasingly leading ...
Keywords: Irony detection, Transformer, Multimodality, Image text fusion
Usage rights: Attribution (by)
Source: Journal of Ambient Intelligence and Humanized Computing (eISSN: 1868-5145)
DOI: 10.1007/s12652-022-04447-y
Publisher: Springer
Publisher's version: https://doi.org/10.1007/s12652-022-04447-y
Project code: info:eu-repo/grantAgreement/MICINN//PID2021-122263OB-C22/
Acknowledgements: Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work was partially supported by the Spanish Ministry of Science and Innovation and Fondo Europeo de Desarrollo Regional (FEDER) in ...
Type: Article

References

Agarap AF (2018) Deep learning using rectified linear units (ReLU). arXiv:1803.08375

Alam F, Cresci S, Chakraborty T, et al (2021) A survey on multimodal disinformation detection. arXiv:2103.12541

Cai Y, Cai H, Wan X (2019) Multi-modal sarcasm detection in Twitter with hierarchical fusion model. In: Proceedings of the 57th annual meeting of the ACL. Association for Computational Linguistics, pp 2506–2515. https://doi.org/10.18653/v1/P19-1239

Cignarella AT, Basile V, Sanguinetti M, et al (2020a) Multilingual irony detection with dependency syntax and neural models. In: Proceedings of the 28th international conference on computational linguistics. International Committee on Computational Linguistics, Barcelona, Spain (Online), pp 1346–1358. https://doi.org/10.18653/v1/2020.coling-main.116

Cignarella AT, Sanguinetti M, Bosco C, et al (2020b) Marking irony activators in a Universal Dependencies treebank: the case of an Italian Twitter corpus. In: Proceedings of the 12th language resources and evaluation conference. European Language Resources Association, Marseille, France, pp 5098–5105. https://aclanthology.org/2020.lrec-1.627

Conneau A, Khandelwal K, Goyal N, et al (2020) Unsupervised cross-lingual representation learning at scale. In: Proceedings of the 58th annual meeting of the association for computational linguistics. Association for Computational Linguistics, Online, pp 8440–8451. https://doi.org/10.18653/v1/2020.acl-main.747

Devlin J, Chang MW, Lee K, et al (2019) BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics. Association for Computational Linguistics, pp 4171–4186. https://doi.org/10.18653/v1/N19-1423

Dosovitskiy A, Beyer L, Kolesnikov A, et al (2021) An image is worth 16x16 words: transformers for image recognition at scale. In: International conference on learning representations, pp 1–21. https://openreview.net/forum?id=YicbFdNTTy

Gadzicki K, Khamsehashari R, Zetzsche C (2020) Early vs late fusion in multimodal convolutional neural networks. In: 2020 IEEE 23rd international conference on information fusion (FUSION), pp 1–6. https://doi.org/10.23919/FUSION45008.2020.9190246

Giachanou A, Zhang G, Rosso P (2020) Multimodal fake news detection with textual, visual and semantic information. Text, speech, and dialogue. Springer, Cham, pp 30–38

Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press, Cambridge, MA

He P, Liu X, Gao J, et al (2020) DeBERTa: decoding-enhanced BERT with disentangled attention. arXiv:2006.03654

Howard J, Ruder S (2018) Universal language model fine-tuning for text classification. In: Proceedings of the 56th annual meeting of the ACL. Association for Computational Linguistics, pp 328–339. https://doi.org/10.18653/v1/P18-1031

Hutto C, Gilbert E (2014) VADER: a parsimonious rule-based model for sentiment analysis of social media text. Proc Int AAAI Conf Web Soc Media 8(1):216–225

Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning, PMLR, pp 448–456

Joshi A, Bhattacharyya P, Carman MJ (2017) Automatic sarcasm detection: a survey. ACM Comput Surv 50(5):1–22. https://doi.org/10.1145/3124420

Kiela D, Firooz H, Mohan A, et al (2021) The hateful memes challenge: competition report. In: Escalante HJ, Hofmann K (eds) Proceedings of the NeurIPS 2020 competition and demonstration track, proceedings of machine learning research, vol 133. PMLR, pp 344–360

Li LH, Yatskar M, Yin D, et al (2019) VisualBERT: a simple and performant baseline for vision and language. arXiv:1908.03557

Liu Y, Ott M, Goyal N, et al (2019) RoBERTa: a robustly optimized BERT pretraining approach. arXiv:1907.11692

Maas AL, Hannun AY, Ng AY (2013) Rectifier nonlinearities improve neural network acoustic models. In: Proceedings of the ICML workshop on deep learning for audio, speech and language processing, Atlanta, Georgia, USA, pp 1–6

Mikolov T, Sutskever I, Chen K, et al (2013) Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th international conference on neural information processing systems, vol 2. Curran Associates Inc., NIPS’13, pp 3111–3119

Naseer M, Ranasinghe K, Khan S, et al (2021) Intriguing properties of vision transformers. arXiv:2105.10497

Nguyen DQ, Vu T, Tuan Nguyen A (2020) BERTweet: a pre-trained language model for English tweets. In: Proceedings of the 2020 conference on empirical methods in natural language processing: system demonstrations. Association for Computational Linguistics, Online, pp 9–14. https://doi.org/10.18653/v1/2020.emnlp-demos.2, https://aclanthology.org/2020.emnlp-demos.2

Ojala T, Pietikainen M, Maenpaa T (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans Pattern Anal Mach Intell 24(7):971–987. https://doi.org/10.1109/TPAMI.2002.1017623

Pan H, Lin Z, Fu P, et al (2020) Modeling intra and inter-modality incongruity for multi-modal sarcasm detection. In: Findings of the association for computational linguistics: EMNLP 2020. Association for Computational Linguistics, pp 1383–1392. https://doi.org/10.18653/v1/2020.findings-emnlp.124

Schifanella R, de Juan P, Tetreault J, et al (2016) Detecting sarcasm in multimodal social platforms. In: Proceedings of the 24th ACM international conference on multimedia. Association for Computing Machinery, New York, NY, USA, MM ’16, pp 1136–1145. https://doi.org/10.1145/2964284.2964321

Srivastava N, Hinton G, Krizhevsky A et al (2014) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

Tan H, Bansal M (2019) LXMERT: learning cross-modality encoder representations from transformers. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP). Association for Computational Linguistics, pp 5100–5111. https://doi.org/10.18653/v1/D19-1514

Van Hee C, Lefever E, Hoste V (2018) SemEval-2018 task 3: irony detection in English tweets. In: Proceedings of The 12th international workshop on semantic evaluation. Association for Computational Linguistics, New Orleans, Louisiana, pp 39–50. https://doi.org/10.18653/v1/S18-1005, https://aclanthology.org/S18-1005

Vaswani A, Shazeer N, Parmar N, et al (2017) Attention is all you need. In: Advances in neural information processing systems, vol 30. Curran Associates, Inc., pp 5998–6008

Wang X, Sun X, Yang T, et al (2020) Building a bridge: a method for image-text sarcasm detection without pretraining on image-text data. In: Proceedings of the first international workshop on natural language processing beyond text. Association for Computational Linguistics, pp 19–29. https://doi.org/10.18653/v1/2020.nlpbt-1.3

Xu N, Zeng Z, Mao W (2020) Reasoning with multimodal sarcastic tweets via modeling cross-modality contrast and semantic association. In: Proceedings of the 58th annual meeting of the association for computational linguistics. Association for Computational Linguistics, pp 3777–3786. https://doi.org/10.18653/v1/2020.acl-main.349

