
Sentiment Analysis and Stance Detection on German YouTube Comments on Gender Diversity

RiuNet: Institutional Repository of the Universitat Politècnica de València


dc.contributor.author Melnyk, Lidiia es_ES
dc.contributor.author Feld, Linda es_ES
dc.date.accessioned 2023-01-09T08:29:28Z
dc.date.available 2023-01-09T08:29:28Z
dc.date.issued 2022-11-23
dc.identifier.uri http://hdl.handle.net/10251/191102
dc.description.abstract [EN] This paper explores different options for detecting the stance of German YouTube comments on the topic of gender diversity and compares the results with those of sentiment analysis, showing that these are two very different NLP tasks that focus on distinct characteristics of the discourse. While an existing model (BERT) was used to analyze the comments' sentiment, the comments' stance was first annotated and then used to train different models (SVM with TF-IDF, DistilBERT, LSTM, and CNN) for predicting the stance of unseen comments. The best results were achieved by the CNN, reaching 78.3% accuracy (92% after dataset normalization) on the test set. Whereas the most common stance identified in the comments is a neutral one (neither completely in favor of nor completely against gender diversity), the overall sentiment of the discourse turns out to be negative. This shows that the discourse revolving around the topic of gender diversity in YouTube comments is filled with strong opinions, on the one hand, but also opens up a space for anonymously inquiring and learning about the topic and its implications, on the other. Our research thereby (1) contributes to the understanding and application of different NLP tasks used to predict the sentiment and stance of unstructured textual data, and (2) provides relevant insights into society's attitudes towards a changing system of values and beliefs. es_ES
dc.language English es_ES
dc.publisher Universitat Politècnica de València es_ES
dc.relation.ispartof Journal of Computer-Assisted Linguistic Research es_ES
dc.rights Attribution - NonCommercial - NoDerivatives (by-nc-nd) es_ES
dc.subject Stance detection es_ES
dc.subject Sentiment analysis es_ES
dc.subject BERT es_ES
dc.subject Neural networks es_ES
dc.subject Annotation es_ES
dc.subject YouTube comments es_ES
dc.subject Gender diversity es_ES
dc.title Sentiment Analysis and Stance Detection on German YouTube Comments on Gender Diversity es_ES
dc.type Article es_ES
dc.identifier.doi 10.4995/jclr.2022.18224
dc.rights.accessRights Open es_ES
dc.description.bibliographicCitation Melnyk, L.; Feld, L. (2022). Sentiment Analysis and Stance Detection on German YouTube Comments on Gender Diversity. Journal of Computer-Assisted Linguistic Research. 6:59-86. https://doi.org/10.4995/jclr.2022.18224 es_ES
dc.description.accrualMethod OJS es_ES
dc.relation.publisherversion https://doi.org/10.4995/jclr.2022.18224 es_ES
dc.description.upvformatpinicio 59 es_ES
dc.description.upvformatpfin 86 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 6 es_ES
dc.identifier.eissn 2530-9455
dc.relation.pasarela OJS\18224 es_ES
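
The abstract's central observation, that the majority stance of a set of comments can differ from its majority sentiment, can be illustrated with a minimal sketch. The labels below are hypothetical toy annotations, not data from the paper, and the helper names `majority` and `cross_tab` are assumptions introduced only for this illustration; the authors' actual pipeline (BERT for sentiment, a trained CNN for stance) is not reproduced here.

```python
from collections import Counter

# Hypothetical toy annotations (NOT data from the paper): every comment
# carries one stance label and one independently assigned sentiment label.
comments = [
    {"stance": "neutral", "sentiment": "negative"},
    {"stance": "neutral", "sentiment": "negative"},
    {"stance": "neutral", "sentiment": "neutral"},
    {"stance": "favor", "sentiment": "negative"},
    {"stance": "favor", "sentiment": "positive"},
    {"stance": "against", "sentiment": "negative"},
]


def majority(labels):
    """Return the most frequent label and its share of all labels."""
    counts = Counter(labels)
    label, n = counts.most_common(1)[0]
    return label, n / len(labels)


def cross_tab(rows):
    """Count how often each (stance, sentiment) pair occurs together."""
    return Counter((r["stance"], r["sentiment"]) for r in rows)


stance_major = majority([c["stance"] for c in comments])
sentiment_major = majority([c["sentiment"] for c in comments])
table = cross_tab(comments)

print("majority stance:", stance_major)
print("majority sentiment:", sentiment_major)
for (stance, sentiment), n in sorted(table.items()):
    print(f"{stance:8s} {sentiment:8s} {n}")
```

In this toy set the most common stance is neutral while the most common sentiment is negative, and a comment in favor of the topic can still carry negative sentiment, which is why the two labels are treated as separate prediction tasks.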

