Combining Embeddings of Input Data for Text Classification

Parcheta, Zuzanna; Sanchis Trilles, Germán; Casacuberta Nolla, Francisco; Rendahl, Robin

doi:10.1007/s11063-020-10312-w

Files in this item

Name: ParchetaSanchisCa ...

Size: 565.7Kb

Format: PDF

Description: Author's version.

Name: 2021-NPL-Parchert ...

Size: 472.9Kb

Format: PDF

Description: Publisher's version

dc.contributor.author	Parcheta, Zuzanna	es_ES
dc.contributor.author	Sanchis Trilles, Germán	es_ES
dc.contributor.author	Casacuberta Nolla, Francisco	es_ES
dc.contributor.author	Rendahl, Robin	es_ES
dc.date.accessioned	2022-06-16T18:05:48Z
dc.date.available	2022-06-16T18:05:48Z
dc.date.issued	2021-10	es_ES
dc.identifier.issn	1370-4621	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/183409
dc.description.abstract	[EN] The problem of automatic text classification is an essential part of text analysis. The improvement of text classification can be done at different levels such as a preprocessing step, network implementation, etc. In this paper, we focus on how the combination of different methods of text encoding may affect classification accuracy. To do this, we implemented a multi-input neural network that is able to encode input text using several text encoding techniques such as BERT, neural embedding layer, GloVe, skip-thoughts and ParagraphVector. The text can be represented at different levels of tokenised input text such as the sentence level, word level, byte pair encoding level and character level. Experiments were conducted on seven datasets from different language families: English, German, Swedish and Czech. Some of those languages contain agglutinations and grammatical cases. Two out of seven datasets originated from real commercial scenarios: (1) classifying ingredients into their corresponding classes by means of a corpus provided by Northfork; and (2) classifying texts according to the English level of their corresponding writers by means of a corpus provided by ProvenWord. The developed architecture achieves an improvement with different combinations of text encoding techniques depending on the different characteristics of the datasets. Once the best combination of embeddings at different levels was determined, different architectures of multi-input neural networks were compared. The results obtained with the best embedding combination and best neural network architecture were compared with state-of-the-art approaches. The results obtained with the dataset used in the experiments were better than the state-of-the-art baselines.	es_ES
dc.description.sponsorship	This work is partially supported by MINECO under Grant DI-15-08169 and by Sciling under its R+D program. The authors would like to thank NVIDIA for their donation of a Titan Xp GPU, which allowed us to conduct this research	es_ES
dc.language	English	es_ES
dc.publisher	Springer-Verlag	es_ES
dc.relation.ispartof	Neural Processing Letters	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	Text classification	es_ES
dc.subject	Multi-input network	es_ES
dc.subject	Agglutinative language	es_ES
dc.subject	Inflected language	es_ES
dc.subject	Embedding combination	es_ES
dc.subject.classification	COMPUTER LANGUAGES AND SYSTEMS	es_ES
dc.title	Combining Embeddings of Input Data for Text Classification	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.1007/s11063-020-10312-w	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MINECO//DI-15-08169/ES/DI-15-08169/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.description.bibliographicCitation	Parcheta, Z.; Sanchis Trilles, G.; Casacuberta Nolla, F.; Rendahl, R. (2021). Combining Embeddings of Input Data for Text Classification. Neural Processing Letters. 53(5):3123-3151. https://doi.org/10.1007/s11063-020-10312-w	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	https://doi.org/10.1007/s11063-020-10312-w	es_ES
dc.description.upvformatpinicio	3123	es_ES
dc.description.upvformatpfin	3151	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	53	es_ES
dc.description.issue	5	es_ES
dc.relation.pasarela	S\417680	es_ES
dc.contributor.funder	Ministerio de Economía y Competitividad	es_ES
dc.description.references	Abadi M, Barham P, Chen J, Chen Z et al (2016) Tensorflow: a system for large-scale machine learning. In: Proceedings of 12th USENIX symposium on operating systems design and implementation (OSDI 16), pp 265–283	es_ES
dc.description.references	Bahdanau D, Cho K, Bengio Y (2015) Neural machine translation by jointly learning to align and translate. In: Proceedings of workshop at the international conference on learning representations (ICLR)	es_ES
dc.description.references	Bergsma S, Kondrak G (2007) Alignment-based discriminative string similarity. In: Proceedings of the 45th annual meeting of the association of computational linguistics, pp 656–663	es_ES
dc.description.references	Bojanowski P, Grave E, Joulin A, Mikolov T (2017) Enriching word vectors with subword information. Trans Assoc Comput Linguist 5:135–146	es_ES
dc.description.references	Bridle JS (1989) Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In: Fogelman Soulié F, Hérault J (eds) Neurocomputing. Springer, Berlin, Heidelberg, pp 227–236	es_ES
dc.description.references	Chen D, Manning CD (2014) A fast and accurate dependency parser using neural networks. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP), pp 740–750	es_ES
dc.description.references	Chollet F (2016) Using pre-trained word embeddings in a keras model. The Keras Blog, London	es_ES
dc.description.references	Chollet F, Falbel D, Allaire J, Tang YT, Van Der Bijl W, Studer M, Keydana S (2015) Keras: deep learning library for theano and tensorflow, vols 7, 8. https://keras.io/k	es_ES
dc.description.references	Conneau A, Kiela D, Schwenk H, Barrault L, Bordes A (2017) Supervised learning of universal sentence representations from natural language inference data. In: Proceedings of the 2017 conference on empirical methods in natural language processing, pp 670–680	es_ES
dc.description.references	Dai AM, Olah C, Le QV (2015) Document embedding with paragraph vectors. Preprint arXiv:1507.07998v1	es_ES
dc.description.references	Devlin J, Chang M, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. Preprint arXiv:1810.04805	es_ES
dc.description.references	Gage P (1994) A new algorithm for data compression. C Users J 12:23–38	es_ES
dc.description.references	Goasduff L, Omale G (2018) Gartner survey finds consumers would use AI to save time and money. Gartner, Berlin	es_ES
dc.description.references	Gupta V, Karnick H, Bansal A, Jhala P (2016) Product classification in e-commerce using distributional semantics. In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp 536–546	es_ES
dc.description.references	Habernal I, Brychcín T (2013) Unsupervised improving of sentiment analysis using global target context. Proc Recent Adv Nat Lang Process 2013:122–128	es_ES
dc.description.references	Hill F, Cho K, Korhonen A (2016) Learning distributed representations of sentences from unlabelled data. In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp 1367–1377	es_ES
dc.description.references	Hövelmann L, Allee S, Friedrich CM (2017) Fasttext and gradient boosted trees at Germeval-2017 on relevance classification and document-level polarity. In: Shared task on aspect-based sentiment in social media customer feedback, pp 30–35	es_ES
dc.description.references	Ionescu RT, Butnaru A (2019) Vector of locally-aggregated word embeddings (VLAWE): a novel document-level representation. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers), pp 363–369	es_ES
dc.description.references	Jain G, Sharma M, Agarwal B (2019) Spam detection in social media using convolutional and long short term memory neural network. Ann Math Artif Intell 85(1):21–44	es_ES
dc.description.references	Jauhiainen TS, Lui M, Zampieri M, Baldwin T, Lindén K (2019) Automatic language identification in texts: a survey. J Artif Intell Res 65:675–782	es_ES
dc.description.references	Joulin A, Grave E, Bojanowski P, Mikolov T (2017) Bag of tricks for efficient text classification. In: Proceedings of conference of the European chapter of the association for computational linguistics (ACL), vol 2, pp 427–431	es_ES
dc.description.references	Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. Preprint arXiv:1412.6980	es_ES
dc.description.references	Kiros R, Zhu Y, Salakhutdinov R, Zemel RS, Torralba A, Urtasun R, Fidler S (2015) Skip-thought vectors. Preprint arXiv:1506.06726	es_ES
dc.description.references	Koehn P (2004) Statistical significance tests for machine translation evaluation. In: Proceedings of the 2004 conference on empirical methods on natural language processing, pp 388–395	es_ES
dc.description.references	Parcheta Z, Sanchis-Trilles G, Casacuberta F, Rendahl R (2019) Multi-input CNN for text classification in commercial scenarios. In: Proceedings of the international work-conference on artificial neural networks. Springer, Berlin, pp 596–608	es_ES
dc.description.references	Pennington J, Socher R, Manning C (2014) Glove: global vectors for word representation. In: Proceedings of the conference on empirical methods in natural language processing (EMNLP), pp 1532–1543	es_ES
dc.description.references	Sadr H, Pedram MM, Teshnehlab M (2019) A robust sentiment analysis method based on sequential combination of convolutional and recursive neural networks. Neural Process Lett 50(3):2745–2761	es_ES
dc.description.references	Sayyed ZA, Dakota D, Kübler S (2017) IDS IUCL: investigating feature selection and oversampling for GermEval2017. Shared task on aspect-based sentiment in social media customer feedback, pp 43–48	es_ES
dc.description.references	Sennrich R, Haddow B, Birch A (2016) Neural machine translation of rare words with subword units. In: Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies (NAACL HLT), vol 1, pp 1715–1725	es_ES
dc.description.references	Socher R, Perelygin A, Wu J, Chuang J, Manning CD, Ng AY, Potts C (2013) Recursive deep models for semantic compositionality over a sentiment treebank. In: Proceedings of the 2013 conference on empirical methods in natural language processing, pp 1631–1642	es_ES
dc.description.references	Stein RA, Jaques PA, Valiati JF (2018) An analysis of hierarchical text classification using word embeddings. Preprint arXiv:1809.01771	es_ES
dc.description.references	Strange W, Bohn OS, Nishi K, Trent SA (2005) Contextual variation in the acoustic and perceptual similarity of North German and American English vowels. J Acoust Soc Am 118(3):1751–1762	es_ES
dc.description.references	Strange W, Bohn OS, Trent SA, Nishi K (2004) Acoustic and perceptual similarity of North German and American English vowels. J Acoust Soc Am 115(4):1791–1807	es_ES
dc.description.references	Tiwary A (2017) Time is money and artificial intelligence can save you time. Digital CMO, London	es_ES
dc.description.references	Vaswani A, Bengio S, Brevdo E, Chollet F, Gomez AN, Gouws S, Jones L, Kaiser L, Kalchbrenner N, Parmar N, Sepassi R, Shazeer N, Uszkoreit J (2018) Tensor2tensor for neural machine translation. Preprint arXiv:1803.07416	es_ES
dc.description.references	Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser Ł, Polosukhin I (2017) Attention is all you need. In: Advances in neural information processing systems, pp 5998–6008	es_ES
dc.description.references	Wojatzki M, Ruppert E, Holschneider S, Zesch T, Biemann C (2017) Germeval 2017: shared task on aspect-based sentiment in social media customer feedback. In: Shared task on aspect-based sentiment in social media customer feedback, pp 1–12	es_ES
dc.description.references	Xu B, Wang N, Chen T, Li M (2015) Empirical evaluation of rectified activations in convolutional network. Preprint arXiv:1505.00853	es_ES
dc.description.references	Xu J, Zhang C, Zhang P, Song D (2018) Text classification with enriched word features. In: Proceedings of the 16th Pacific RIM international conference on artificial intelligence (PRICAI). Springer, Berlin, pp 274–281	es_ES
dc.description.references	Zhang L, Wang S, Liu B (2018) Deep learning for sentiment analysis: a survey. Wiley Interdiscip Rev Data Min Knowl Discov 8(4):e1253	es_ES
dc.description.references	Zhang X, LeCun Y (2015) Text understanding from scratch. Preprint arXiv:1502.01710	es_ES
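
Illustrative sketch (not part of the record): the abstract describes encoding the same input text at several tokenisation levels and combining the resulting encodings. The Python fragment below shows that idea in its simplest form, assuming plain concatenation of mean-pooled vectors at word level and character level. The names toy_embedding, mean_pool, encode_levels and combine, the vector size DIM and the example text are all hypothetical. The hash-based vectors are only a stand-in for learned or pre-trained embeddings such as GloVe or BERT, and the authors' multi-input neural network is not reproduced here.

```python
import hashlib

DIM = 8  # illustrative embedding size, not taken from the paper


def toy_embedding(token: str, dim: int = DIM) -> list[float]:
    """Deterministic stand-in for a learned or pre-trained embedding lookup."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    # Map the first `dim` bytes of the digest to floats in [-1, 1].
    return [digest[i] / 127.5 - 1.0 for i in range(dim)]


def mean_pool(vectors: list[list[float]], dim: int = DIM) -> list[float]:
    """Average token vectors into one fixed-size vector (zeros if empty)."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def encode_levels(text: str) -> dict[str, list[float]]:
    """Encode one text at word level and at character level."""
    words = text.lower().split()
    chars = [c for c in text.lower() if not c.isspace()]
    return {
        "word": mean_pool([toy_embedding("w:" + w) for w in words]),
        "char": mean_pool([toy_embedding("c:" + c) for c in chars]),
    }


def combine(levels: dict[str, list[float]], order=("word", "char")) -> list[float]:
    """Concatenate the per-level vectors into a single feature vector."""
    combined: list[float] = []
    for name in order:
        combined.extend(levels[name])
    return combined


# Hypothetical ingredient-style input, chosen only for illustration.
features = combine(encode_levels("wholegrain wheat flour"))
print(len(features))  # 16: DIM values per level, two levels
```

In a real system each level would feed its own input branch of a classifier, and the combined vector (or the merged branch outputs) would be passed to the classification layers. Concatenation is only one possible way to merge the branches.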

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia
