dc.contributor.author | Franco Salvador, Marc | es_ES |
dc.contributor.author | Rangel, Francisco | es_ES |
dc.contributor.author | Rosso, Paolo | es_ES |
dc.contributor.author | Taulé, Mariona | es_ES |
dc.contributor.author | Martí, M. Antònia | es_ES |
dc.date.accessioned | 2016-05-19T09:41:48Z | |
dc.date.available | 2016-05-19T09:41:48Z | |
dc.date.issued | 2015-11-20 | |
dc.identifier.isbn | 978-3-319-24026-8 | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.uri | http://hdl.handle.net/10251/64372 | |
dc.description | The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-24027-5_3 | es_ES |
dc.description.abstract | In this work we focus on the use of distributed representations of words and documents using the continuous Skip-gram model. We compare this model with three recent approaches: Information Gain Word-Patterns, TF-IDF graphs and Emotion-labeled Graphs, in addition to several baselines. We evaluate the models introducing the Hispablogs dataset, a new collection of Spanish blogs from five different countries: Argentina, Chile, Mexico, Peru and Spain. Experimental results show state-of-the-art performance in language variety identification. | es_ES |
dc.description.sponsorship | This research has been carried out within the framework of the European Commission WIQ-EI IRSES (no. 269180) and DIANA - Finding Hidden Knowledge in Texts (TIN2012-38603-C02) projects. The work of the second author was partially funded by Autoritas Consulting SA and by the Spanish Ministry of Economics by means of an ECOPORTUNITY IPT-2012-1220-430000 grant. | es_ES |
dc.language | English | es_ES |
dc.publisher | Springer International Publishing | es_ES |
dc.relation.ispartof | Experimental IR Meets Multilinguality, Multimodality, and Interaction: 6th International Conference of the CLEF Association, CLEF'15, Toulouse, France, September 8-11, 2015, Proceedings | es_ES |
dc.relation.ispartofseries | Lecture Notes in Computer Science;9283 | |
dc.rights | All rights reserved | es_ES |
dc.subject | Author profiling | es_ES |
dc.subject | Language variety identification | es_ES |
dc.subject | Distributed representations | es_ES |
dc.subject | Information Gain Word-Patterns | es_ES |
dc.subject | TF-IDF graphs | es_ES |
dc.subject | Emotion-labeled Graphs | es_ES |
dc.subject.classification | COMPUTER LANGUAGES AND SYSTEMS | es_ES |
dc.title | Language variety identification using distributed representations of words and documents | es_ES |
dc.type | Book chapter | es_ES |
dc.identifier.doi | 10.1007/978-3-319-24027-5_3 | |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//TIN2012-38603-C02/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/FP7/269180/EU/Web Information Quality Evaluation Initiative/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//IPT-2012-1220-430000/ES/ECOPORTUNITY/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació | es_ES |
dc.description.bibliographicCitation | Franco Salvador, M.; Rangel, F.; Rosso, P.; Taulé, M.; Martí, MA. (2015). Language variety identification using distributed representations of words and documents. In Experimental IR Meets Multilinguality, Multimodality, and Interaction: 6th International Conference of the CLEF Association, CLEF'15, Toulouse, France, September 8-11, 2015, Proceedings. Springer International Publishing. 28-40. https://doi.org/10.1007/978-3-319-24027-5_3 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | http://link.springer.com/chapter/10.1007/978-3-319-24027-5_3 | es_ES |
dc.description.upvformatpinicio | 28 | es_ES |
dc.description.upvformatpfin | 40 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.relation.senia | 303139 | es_ES |
dc.contributor.funder | European Commission | es_ES |
dc.contributor.funder | Ministerio de Educación y Ciencia | es_ES |
dc.contributor.funder | Autoritas Consulting, S.A. | es_ES |
dc.contributor.funder | Ministerio de Economía y Competitividad | es_ES |
dc.description.references | Barto, A.G.: Reinforcement learning: An introduction. MIT press (1998) | es_ES |
dc.description.references | Bengio, Y., Ducharme, R., Vincent, P., Janvin, C.: A neural probabilistic language model. The Journal of Machine Learning Research 3, 1137–1155 (2003) | es_ES |
dc.description.references | Dumais, S.T.: Latent semantic analysis. Annual Review of Information Science and Technology 38(1), 188–230 (2004) | es_ES |
dc.description.references | Gutmann, M.U., Hyvärinen, A.: Noise-contrastive estimation of unnormalized statistical models, with applications to natural image statistics. The Journal of Machine Learning Research 13(1), 307–361 (2012) | es_ES |
dc.description.references | Hinton, G.E., McClelland, J.L., Rumelhart, D.E.: Distributed representations. In: Rumelhart, D.E., McClelland, J.L., (eds.) Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press (1986) | es_ES |
dc.description.references | Kim, Y.: Convolutional neural networks for sentence classification. In: Proceedings of the International Conference on Empirical Methods in Natural Language Processing (2014) | es_ES |
dc.description.references | Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. In: Proceedings of the 31st International Conference on Machine Learning (2014) | es_ES |
dc.description.references | Levin, B.: English verb classes and alternations. University of Chicago Press, Chicago (1993) | es_ES |
dc.description.references | Maier, W., Gómez-Rodríguez, C.: Language variety identification in Spanish tweets. In: Proceedings of the EMNLP’2014 Workshop on Language Technology for Closely Related Languages and Language Variants, pp. 25–35. Association for Computational Linguistics, Doha, Qatar, October 2014. http://emnlp2014.org/workshops/LT4CloseLang/call.html | es_ES |
dc.description.references | Martí, M.A., Bertran, M., Taulé, M., Salamó, M.: Distributional approach based on syntactic dependencies for discovering constructions. Computational Linguistics (2015, under review) | es_ES |
dc.description.references | Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient estimation of word representations in vector space. In: Proceedings of Workshop at International Conference on Learning Representations (2013) | es_ES |
dc.description.references | Mikolov, T., Karafiát, M., Burget, L., Černocký, J., Khudanpur, S.: Recurrent neural network based language model. In: INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, pp. 1045–1048, September 26–30, 2010 | es_ES |
dc.description.references | Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, vol. 26, pp. 3111–3119 (2013) | es_ES |
dc.description.references | Mnih, A., Teh, Y.W.: A fast and simple algorithm for training neural probabilistic language models. arXiv preprint arXiv:1206.6426 (2012) | es_ES |
dc.description.references | Mohammad, S.M., Yang, T.: Tracking sentiment in mail: how genders differ on emotional axes. In: Proceedings of the 2nd Workshop on Computational Approaches to Subjectivity and Sentiment Analysis (2011) | es_ES |
dc.description.references | Morin, F., Bengio, Y.: Hierarchical probabilistic neural network language model. In: Proceedings of the International Workshop on Artificial Intelligence and Statistics, pp. 246–252. Citeseer (2005) | es_ES |
dc.description.references | Pennebaker, J.W.: The secret life of pronouns: What our words say about us. Bloomsbury Press (2011) | es_ES |
dc.description.references | Rangel, F., Rosso, P.: On the impact of emotions on author profiling. Information Processing & Management, Special Issue on Emotion and Sentiment in Social and Expressive Media (2015, in press) | es_ES |
dc.description.references | Rangel, F., Rosso, P., Chugur, I., Potthast, M., Trenkmann, M., Stein, B., Verhoeven, B., Daelemans, W.: Overview of the 2nd author profiling task at PAN 2014. In: Cappellato, L., Ferro, N., Halvey, M., Kraaij, W. (eds.) CLEF 2014 Labs and Workshops, Notebook Papers. CEUR-WS.org, vol. 1180 (2014) | es_ES |
dc.description.references | Rangel, F., Rosso, P., Koppel, M., Stamatatos, E., Inches, G.: Overview of the author profiling task at PAN 2013. In: Forner, P., Navigli, R., Tufis, D. (eds.) Notebook Papers of CLEF 2013 Labs and Workshops. CEUR-WS.org, vol. 1179 (2013) | es_ES |
dc.description.references | Sadat, F., Kazemi, F., Farzindar, A.: Automatic identification of Arabic language varieties and dialects in social media. In: Proceedings of the 1st International Workshop on Social Media Retrieval and Analysis SoMeRa (2014) | es_ES |
dc.description.references | Salton, G., Wong, A., Yang, C.S.: A vector space model for automatic indexing. Communications of the ACM 18(11), 613–620 (1975) | es_ES |
dc.description.references | Sidorov, G., Miranda-Jiménez, S., Viveros-Jiménez, F., Gelbukh, F., Castro-Sánchez, N., Velásquez, F., Díaz-Rangel, I., Suárez-Guerra, S., Treviño, A., Gordon-Miranda, J.: Empirical study of opinion mining in Spanish tweets. In: 11th Mexican International Conference on Artificial Intelligence, MICAI, pp. 1–4 (2012) | es_ES |
dc.description.references | Zampieri, M., Gebrekidan-Gebre, B.: Automatic identification of language varieties: the case of Portuguese. In: Proceedings of the Conference on Natural Language Processing (2012) | es_ES |