A Low Dimensionality Representation for Language Variety Identification

Rangel-Pardo, Francisco Manuel; Franco-Salvador, Marc; Rosso, Paolo

doi:10.1007/978-3-319-75487-1_13

Identificarse

Buscar en RiuNet

Listar

Todo RiuNet
Esta colección

Mi cuenta

Acceder

Estadísticas

Ver Estadísticas de uso

Ayuda RiuNet

Admin. UPV

Compartir/Enviar a

Citas

Estadísticas

A Low Dimensionality Representation for Language Variety Identification

Mostrar el registro completo del ítem

Rangel-Pardo, FM.; Franco-Salvador, M.; Rosso, P. (2018). A Low Dimensionality Representation for Language Variety Identification. Lecture Notes in Computer Science. 9624:156-169. https://doi.org/10.1007/978-3-319-75487-1_13

Por favor, use este identificador para citar o enlazar este ítem: http://hdl.handle.net/10251/146184

Ficheros en el ítem

Nombre: Rangel-Pardo;Fran ...

Tamaño: 358.4Kb

Formato: PDF

Descripción: Versión del Autor.

Abrir/Preview

Nombre: Rangel-et-al-LNCS.pdf

Tamaño: 682.1Kb

Formato: PDF

Descripción: Versión editorial

Solicitar una copia al autor

Metadatos del ítem

Título:

A Low Dimensionality Representation for Language Variety Identification

Autor:

Rangel-Pardo, Francisco Manuel Franco-Salvador, Marc

Rosso, Paolo

Entidad UPV:

Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació

Fecha difusión:

2018

Resumen:

[EN] Language variety identification aims at labelling texts in a native language (e.g. Spanish, Portuguese, English) with its specific variation (e.g. Argentina, Chile, Mexico, Peru, Spain; Brazil, Portugal; UK, US). In ...[+]

Palabras clave:

Low dimensionality representation , Language variety identification , Similar languages discrimination , Author profiling , Big data , Social media

Derechos de uso:

Reserva de todos los derechos

Fuente:

Lecture Notes in Computer Science. (issn: 0302-9743 )

DOI:

10.1007/978-3-319-75487-1_13

Editorial:

Springer-Verlag

Versión del editor:

https://doi.org/10.1007/978-3-319-75487-1_13

Título del congreso:

17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2016)

Lugar del congreso:

Konya, Turquía

Fecha congreso:

Abril 03-09,2016

Código del Proyecto:

info:eu-repo/grantAgreement/GVA//PROMETEOII%2F2014%2F030/ES/ Adaptive learning and multimodality in machine translation and text transcription/
info:eu-repo/grantAgreement/MINECO//TIN2015-71147-C2-1-P/ES/COMPRENSION DEL LENGUAJE EN LOS MEDIOS DE COMUNICACION SOCIAL - REPRESENTANDO CONTEXTOS DE FORMA CONTINUA/

Agradecimientos:

The work of the first author was in the framework of ECOPORTUNITY IPT-2012-1220-430000. The work of the last two authors was in the framework of the SomEMBED MINECO TIN2015-71147-C2-1-P research project. This work has been ...[+]

Tipo:

Artículo Comunicación en congreso Capítulo de libro

References

Franco-Salvador, M., Rangel, F., Rosso, P., Taulé, M., Antònia Martít, M.: Language variety identification using distributed representations of words and documents. In: Mothe, J., Savoy, J., Kamps, J., Pinel-Sauvagnat, K., Jones, G.J.F., SanJuan, E., Cappellato, L., Ferro, N. (eds.) CLEF 2015. LNCS, vol. 9283, pp. 28–40. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-24027-5_3

Goodman, J.: Classes for fast maximum entropy training. In: Proceedings of the Acoustics, Speech, and Signal Processing (ICASSP 2001), vol. 1, pp. 561–564 (2001)

Gutmann, M.U., Hyvärinen, A.: Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics. J. Mach. Learn. Res. 13, 307–361 (2012)

Hinton, G.E., Mcclelland, J.L., Rumelhart, D.E.: Distributed Representations, Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Foundations, vol. 1. MIT Press, Cambridge (1986)

Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. In: Proceedings of the 31st International Conference on Machine Learning (ICML 2014), vol. 32 (2014)

Maier, W., Gómez-Rodríguez, C.: Language variety identification in Spanish tweets. In: Workshop on Language Technology for Closely Related Languages and Language Variants (EMNLP 2014), pp. 25–35 (2014)

Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. In: Proceedings of Workshop at International Conference on Learning Representations (ICLR 2013) (2013)

Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)

Mnih, A., Teh, Y.W.: A fast and simple algorithm for training neural probabilistic language models. In: Proceedings of the 29th International Conference on Machine Learning (ICML 2012), pp. 1751–1758 (2012)

Sadat, F., Kazemi, F., Farzindar, A.: Automatic identification of Arabic language varieties and dialects in social media. In: 1st International Workshop on Social Media Retrieval and Analysis (SoMeRa 2014) (2014)

Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Inf. Process. Manag. 24(5), 513–523 (1988)

Tan, L., Zampieri, M., Ljubešic, N., Tiedemann, J.: Merging comparable data sources for the discrimination of similar languages: the DSL corpus collection. In: 7th Workshop on Building and Using Comparable Corpora Building Resources for Machine Translation Research (BUCC 2014), pp. 6–10 (2014)

Zampieri, M., Gebrekidan-Gebre, B.: Automatic identification of language varieties: the case of Portuguese. In: Proceedings of the 11th Conference on Natural Language Processing (KONVENS 2012), pp. 233–237 (2012)

Zampieri, M., Tan, L., Ljubeši, N., Tiedemann, J.: A report on the DSL shared task 2014. In: Proceedings of the First Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects (VarDial 2014), pp. 58–67 (2014)

[-]

recommendations

Este ítem aparece en la(s) siguiente(s) colección(ones)

Artículos, conferencias, monografías [48344]

Mostrar el registro completo del ítem

A Low Dimensionality Representation for Language Variety Identification

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

Buscar en RiuNet

Listar

Todo RiuNet

Esta colección

Mi cuenta

Estadísticas

Ayuda RiuNet

Admin. UPV

Compartir/Enviar a

Citas

Estadísticas

A Low Dimensionality Representation for Language Variety Identification

Ficheros en el ítem

Metadatos del ítem

References

recommendations

Este ítem aparece en la(s) siguiente(s) colección(ones)