[EN] In this paper, we describe a hybrid approach for word-level language (WLL) identification of Bangla words written in Roman script and mixed with English words as part of our participation in the shared task on ...
[EN] Question classification (QC) is a prime constituent of an automated question answering system. The work presented here demonstrates that a combination of multiple models achieves better classification performance than ...
[EN] Before the advent of the Internet era, code-mixing was mainly used in the spoken form. However, with the recent popular informal networking platforms such as Facebook, Twitter, Instagram, etc., in social media, ...
Frenda, Simona; Banerjee, Somnath; Rosso, Paolo; Patti, Viviana (Instituto Politecnico Nacional/Centro de Investigacion en Computacion, 2020)
[EN] In recent years, controlling online user-generated content has become a priority because of the increase in online aggressiveness and hate speech legal cases. Considering the complexity and the importance of ...
[EN] India is a nation of geographical and cultural diversity where over 1600 dialects are spoken. With technological advancement, growing internet penetration, and cheaper access to mobile data, India has ...
Banerjee, Somnath (Universitat Politècnica de València, 2018-10-24)
Only a few research works have been carried out on question answering (QA) systems for the languages of India. There is no QA system for Bengali.
The main objective of this master's thesis (TFM) is the development ...