Word graphs size impact on the performance of handwriting document applications

Toselli ., Alejandro Héctor; Romero Gómez, Verónica; Vidal, Enrique

doi:10.1007/s00521-016-2336-2

Identificarse

Buscar en RiuNet

Listar

Todo RiuNet
Esta colección

Mi cuenta

Acceder

Estadísticas

Ver Estadísticas de uso

Ayuda RiuNet

Admin. UPV

Compartir/Enviar a

Citas

Estadísticas

Word graphs size impact on the performance of handwriting document applications

Mostrar el registro sencillo del ítem

Ficheros en el ítem

Nombre: NCAA-S-15-01579.pdf

Tamaño: 1.075Mb

Formato: PDF

Descripción: Versión del Autor.

Abrir

Nombre: art%3A10.1007%2Fs ...

Tamaño: 1.108Mb

Formato: PDF

Descripción: Versión editorial

Solicitar una copia al autor

dc.contributor.author	Toselli ., Alejandro Héctor	es_ES
dc.contributor.author	Romero Gómez, Verónica	es_ES
dc.contributor.author	Vidal, Enrique	es_ES
dc.date.accessioned	2018-05-18T07:31:06Z
dc.date.available	2018-05-18T07:31:06Z
dc.date.issued	2017	es_ES
dc.identifier.issn	0941-0643	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/102206
dc.description.abstract	[EN] Two document processing applications are con- sidered: computer-assisted transcription of text images (CATTI) and Keyword Spotting (KWS), for transcribing and indexing handwritten documents, respectively. Instead of working directly on the handwriting images, both of them employ meta-data structures called word graphs (WG), which are obtained using segmentation-free hand- written text recognition technology based on N-gram lan- guage models and hidden Markov models. A WG contains most of the relevant information of the original text (line) image required by CATTI and KWS but, if it is too large, the computational cost of generating and using it can become unafordable. Conversely, if it is too small, relevant information may be lost, leading to a reduction of CATTI or KWS performance. We study the trade-off between WG size and performance in terms of effectiveness and effi- ciency of CATTI and KWS. Results show that small, computationally cheap WGs can be used without loosing the excellent CATTI and KWS performance achieved with huge WGs.	es_ES
dc.description.sponsorship	Work partially supported by the Generalitat Valenciana under the Prometeo/2009/014 Project Grant ALMAMATER, by the Spanish MECD as part of the Valorization and I+D+I Resources program of VLC/CAMPUS in the International Excellence Campus program, and through the EU projects: HIMANIS (JPICH programme, Spanish Grant Ref. PCIN-2015-068) and READ (Horizon-2020 programme, Grant Ref. 674943).	en_EN
dc.language	Inglés	es_ES
dc.publisher	Springer-Verlag	es_ES
dc.relation.ispartof	Neural Computing and Applications	es_ES
dc.rights	Reserva de todos los derechos	es_ES
dc.subject	Computer-assisted transcription of text images	es_ES
dc.subject	Keyword spotting for handwritten text	es_ES
dc.subject	Historical handwritten manuscripts	es_ES
dc.subject	Word graphs	es_ES
dc.subject	Evaluation performance	es_ES
dc.subject.classification	ESTADISTICA E INVESTIGACION OPERATIVA	es_ES
dc.subject.classification	LENGUAJES Y SISTEMAS INFORMATICOS	es_ES
dc.title	Word graphs size impact on the performance of handwriting document applications	es_ES
dc.type	Artículo	es_ES
dc.identifier.doi	10.1007/s00521-016-2336-2	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/EC/H2020/674943/EU/Recognition and Enrichment of Archival Documents/	en_EN
dc.relation.projectID	info:eu-repo/grantAgreement/Generalitat Valenciana//PROMETEO09%2F2009%2F014/ES/Adaptive learning and multimodality in pattern recognition (Almapater)/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MINECO//PCIN-2015-068/ES/INDEXACION DE MANUSCRITOS HISTORICOS PARA BUSQUEDAS CONTROLADAS POR EL USUARIO/	es_ES
dc.rights.accessRights	Abierto	es_ES
dc.date.embargoEndDate	2018-09-01	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.description.bibliographicCitation	Toselli ., AH.; Romero Gómez, V.; Vidal, E. (2017). Word graphs size impact on the performance of handwriting document applications. Neural Computing and Applications. 28(9):2477-2487. https://doi.org/10.1007/s00521-016-2336-2	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	https://doi.org/10.1007/s00521-016-2336-2	es_ES
dc.description.upvformatpinicio	2477	es_ES
dc.description.upvformatpfin	2487	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	28	es_ES
dc.description.issue	9	es_ES
dc.relation.pasarela	S\338506	es_ES
dc.contributor.funder	Generalitat Valenciana	es_ES
dc.contributor.funder	European Commission	es_ES
dc.contributor.funder	Ministerio de Economía, Industria y Competitividad	es_ES
dc.description.references	Amengual JC, Vidal E (1998) Efficient error-correcting Viterbi parsing. IEEE Trans Pattern Anal Mach Intell 20(10):1109–1116	es_ES
dc.description.references	Bazzi I, Schwartz R, Makhoul J (1999) An omnifont open-vocabulary OCR system for English and Arabic. IEEE Trans Pattern Anal Mach Intell 21(6):495–504	es_ES
dc.description.references	Erman L, Lesser V (1990) The HEARSAY-II speech understanding system: a tutorial. Readings in Speech Reasoning, pp 235–245	es_ES
dc.description.references	Evermann G (1999) Minimum word error rate decoding. Ph.D. thesis, Churchill College, University of Cambridge	es_ES
dc.description.references	Fischer A, Wuthrich M, Liwicki M, Frinken V, Bunke H, Viehhauser G, Stolz M (2009) Automatic transcription of handwritten medieval documents. In: 15th international conference on virtual systems and multimedia, 2009. VSMM ’09, pp 137–142	es_ES
dc.description.references	Frinken V, Fischer A, Manmatha R, Bunke H (2012) A novel word spotting method based on recurrent neural networks. IEEE Trans Pattern Anal Mach Intell 34(2):211–224	es_ES
dc.description.references	Furcy D, Koenig S (2005) Limited discrepancy beam search. In: Proceedings of the 19th international joint conference on artificial intelligence, IJCAI’05, pp 125–131	es_ES
dc.description.references	Granell E, Martínez-Hinarejos CD (2015) Multimodal output combination for transcribing historical handwritten documents. In: 16th international conference on computer analysis of images and patterns, CAIP 2015, chap, pp 246–260. Springer International Publishing	es_ES
dc.description.references	Hakkani-Tr D, Bchet F, Riccardi G, Tur G (2006) Beyond ASR 1-best: using word confusion networks in spoken language understanding. Comput Speech Lang 20(4):495–514	es_ES
dc.description.references	Jelinek F (1998) Statistical methods for speech recognition. MIT Press, Cambridge	es_ES
dc.description.references	Jurafsky D, Martin JH (2009) Speech and language processing: an introduction to natural language processing, speech recognition, and computational linguistics, 2nd edn. Prentice-Hall, Englewood Cliffs	es_ES
dc.description.references	Kneser R, Ney H (1995) Improved backing-off for N-gram language modeling. In: International conference on acoustics, speech and signal processing (ICASSP ’95), vol 1, pp 181–184. IEEE Computer Society	es_ES
dc.description.references	Liu P, Soong FK (2006) Word graph based speech recognition error correction by handwriting input. In: Proceedings of the 8th international conference on multimodal interfaces, ICMI ’06, pp 339–346. ACM	es_ES
dc.description.references	Lowerre BT (1976) The harpy speech recognition system. Ph.D. thesis, Pittsburgh, PA	es_ES
dc.description.references	Luján-Mares M, Tamarit V, Alabau V, Martínez-Hinarejos CD, Pastor M, Sanchis A, Toselli A (2008) iATROS: a speech and handwritting recognition system. In: V Jornadas en Tecnologías del Habla (VJTH’2008), pp 75–78	es_ES
dc.description.references	Mangu L, Brill E, Stolcke A (2000) Finding consensus in speech recognition: word error minimization and other applications of confusion networks. Comput Speech Lang 14(4):373–400	es_ES
dc.description.references	Manning CD, Raghavan P, Schutze H (2008) Introduction to information retrieval. Cambridge University Press, New York	es_ES
dc.description.references	Mohri M, Pereira F, Riley M (2002) Weighted finite-state transducers in speech recognition. Comput Speech Lang 16(1):69–88	es_ES
dc.description.references	Odell JJ, Valtchev V, Woodland PC, Young SJ (1994) A one pass decoder design for large vocabulary recognition. In: Proceedings of the workshop on human language technology, HLT ’94, pp 405–410. Association for Computational Linguistics	es_ES
dc.description.references	Oerder M, Ney H (1993) Word graphs: an efficient interface between continuous-speech recognition and language understanding. IEEE Int Conf Acoust Speech Signal Process 2:119–122	es_ES
dc.description.references	Olivie J, Christianson C, McCarry J (eds) (2011) Handbook of natural language processing and machine translation. Springer, Berlin	es_ES
dc.description.references	Ortmanns S, Ney H, Aubert X (1997) A word graph algorithm for large vocabulary continuous speech recognition. Comput Speech Lang 11(1):43–72	es_ES
dc.description.references	Padmanabhan M, Saon G, Zweig G (2000) Lattice-based unsupervised MLLR for speaker adaptation. In: ASR2000-automatic speech recognition: challenges for the New Millenium ISCA Tutorial and Research Workshop (ITRW)	es_ES
dc.description.references	Pesch H, Hamdani M, Forster J, Ney H (2012) Analysis of preprocessing techniques for latin handwriting recognition. In: International conference on frontiers in handwriting recognition, ICFHR’12, pp 280–284	es_ES
dc.description.references	Povey D, Ghoshal A, Boulianne G, Burget L, Glembek O, Goel N, Hannemann M, Motlicek P, Qian Y, Schwarz P, Silovsky J, Stemmer G, Vesely K (2011) The Kaldi speech recognition toolkit. In: IEEE 2011 workshop on automatic speech recognition and understanding. IEEE Signal Processing Society	es_ES
dc.description.references	Povey D, Hannemann M, Boulianne G, Burget L, Ghoshal A, Janda M, Karafiat M, Kombrink S, Motlcek P, Qian Y, Riedhammer K, Vesely K, Vu NT (2012) Generating Exact Lattices in the WFST Framework. In: IEEE international conference on acoustics, speech, and signal processing (ICASSP)	es_ES
dc.description.references	Rabiner L (1989) A tutorial of hidden Markov models and selected application in speech recognition. Proc IEEE 77:257–286	es_ES
dc.description.references	Robertson S (2008) A new interpretation of average precision. In: Proceedings of the international ACM SIGIR conference on research and development in information retrieval (SIGIR ’08), pp 689–690. ACM	es_ES
dc.description.references	Romero V, Toselli AH, Rodríguez L, Vidal E (2007) Computer assisted transcription for ancient text images. Proc Int Conf Image Anal Recogn LNCS 4633:1182–1193	es_ES
dc.description.references	Romero V, Toselli AH, Vidal E (2012) Multimodal interactive handwritten text transcription. Series in machine perception and artificial intelligence (MPAI). World Scientific Publishing, Singapore	es_ES
dc.description.references	Rybach D, Gollan C, Heigold G, Hoffmeister B, Lööf J, Schlüter R, Ney H (2009) The RWTH aachen university open source speech recognition system. In: Interspeech, pp 2111–2114	es_ES
dc.description.references	Sánchez J, Mühlberger G, Gatos B, Schofield P, Depuydt K, Davis R, Vidal E, de Does J (2013) tranScriptorium: an European project on handwritten text recognition. In: DocEng, pp 227–228	es_ES
dc.description.references	Saon G, Povey D, Zweig G (2005) Anatomy of an extremely fast LVCSR decoder. In: INTERSPEECH, pp 549–552	es_ES
dc.description.references	Strom N (1995) Generation and minimization of word graphs in continuous speech recognition. In: Proceedings of IEEE workshop on ASR’95, pp 125–126. Snowbird, Utah	es_ES
dc.description.references	Tanha J, de Does J, Depuydt K (2015) Combining higher-order N-grams and intelligent sample selection to improve language modeling for Handwritten Text Recognition. In: ESANN 2015 proceedings, European symposium on artificial neural networks, computational intelligence and machine learning, pp 361–366	es_ES
dc.description.references	Toselli A, Romero V, i Gadea MP, Vidal E (2010) Multimodal interactive transcription of text images. Pattern Recogn 43(5):1814–1825	es_ES
dc.description.references	Toselli A, Romero V, Vidal E (2015) Word-graph based applications for handwriting documents: impact of word-graph size on their performances. In: Paredes R, Cardoso JS, Pardo XM (eds) Pattern recognition and image analysis. Lecture Notes in Computer Science, vol 9117, pp 253–261. Springer International Publishing	es_ES
dc.description.references	Toselli AH, Juan A, Keysers D, Gonzlez J, Salvador I, Ney H, Vidal E, Casacuberta F (2004) Integrated handwriting recognition and interpretation using finite-state models. Int J Pattern Recogn Artif Intell 18(4):519–539	es_ES
dc.description.references	Toselli AH, Vidal E (2013) Fast HMM-Filler approach for key word spotting in handwritten documents. In: Proceedings of the 12th international conference on document analysis and recognition (ICDAR’13). IEEE Computer Society	es_ES
dc.description.references	Toselli AH, Vidal E, Romero V, Frinken V (2013) Word-graph based keyword spotting and indexing of handwritten document images. Technical report, Universitat Politècnica de València	es_ES
dc.description.references	Ueffing N, Ney H (2007) Word-level confidence estimation for machine translation. Comput Linguist 33(1):9–40. doi: 10.1162/coli.2007.33.1.9	es_ES
dc.description.references	Vinciarelli A, Bengio S, Bunke H (2004) Off-line recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Trans Pattern Anal Mach Intell 26(6):709–720	es_ES
dc.description.references	Weng F, Stolcke A, Sankar A (1998) Efficient lattice representation and generation. In: Proceedings of ICSLP, pp 2531–2534	es_ES
dc.description.references	Wessel F, Schluter R, Macherey K, Ney H (2001) Confidence measures for large vocabulary continuous speech recognition. IEEE Trans Speech Audio Process 9(3):288–298	es_ES
dc.description.references	Wolf J, Woods W (1977) The HWIM speech understanding system. In: IEEE international conference on acoustics, speech, and signal processing, ICASSP ’77, vol 2, pp 784–787	es_ES
dc.description.references	Woodland P, Leggetter C, Odell J, Valtchev V, Young S (1995) The 1994 HTK large vocabulary speech recognition system. In: International conference on acoustics, speech, and signal processing (ICASSP ’95), vol 1, pp 73 –76	es_ES
dc.description.references	Young S, Odell J, Ollason D, Valtchev V, Woodland P (1997) The HTK book: hidden Markov models toolkit V2.1. Cambridge Research Laboratory Ltd, Cambridge	es_ES
dc.description.references	Young S, Russell N, Thornton J (1989) Token passing: a simple conceptual model for connected speech recognition systems. Technical report	es_ES
dc.description.references	Zhu M (2004) Recall, precision and average precision. Working Paper 2004–09 Department of Statistics and Actuarial Science, University of Waterloo	es_ES
dc.description.references	Zimmermann M, Bunke H (2004) Optimizing the integration of a statistical language model in hmm based offline handwritten text recognition. In: Proceedings of the 17th international conference on pattern recognition, 2004. ICPR 2004, vol 2, pp 541–544	es_ES

Este ítem aparece en la(s) siguiente(s) colección(ones)

Mostrar el registro sencillo del ítem

Word graphs size impact on the performance of handwriting document applications

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

Buscar en RiuNet

Listar

Todo RiuNet

Esta colección

Mi cuenta

Estadísticas

Ayuda RiuNet

Admin. UPV

Compartir/Enviar a

Citas

Estadísticas

Word graphs size impact on the performance of handwriting document applications

Ficheros en el ítem

Este ítem aparece en la(s) siguiente(s) colección(ones)