dc.contributor.author | Toselli, Alejandro Héctor | es_ES |
dc.contributor.author | Vidal, Enrique | es_ES |
dc.contributor.author | Puigcerver, Joan | es_ES |
dc.contributor.author | Noya-García, Ernesto | es_ES |
dc.date.accessioned | 2020-01-09T21:00:51Z | |
dc.date.available | 2020-01-09T21:00:51Z | |
dc.date.issued | 2019-05-02 | es_ES |
dc.identifier.issn | 1433-7541 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/134140 | |
dc.description.abstract | [EN] Keyword spotting techniques are becoming cost-effective solutions for information retrieval in handwritten documents. We explore extending the single-word, line-level probabilistic indexing approach described in our previous works to allow page-level search for queries consisting of Boolean combinations of several single keywords. We propose heuristic rules to combine the single-word relevance probabilities into probabilistically consistent confidence scores for the multi-word Boolean combinations. An empirical study, also presented in this paper, evaluates the search performance of word-pair queries involving AND and OR Boolean operations. The results of this study support the proposed approach and show its effectiveness. Finally, a web-based demonstration system built on the proposed methods is presented. | es_ES |
dc.description.sponsorship | This work was partially supported by the Generalitat Valenciana under the Prometeo/2009/014 Project Grant ALMAMATER, Spanish MEC under Grant FPU13/06281, and through the EU projects: HIMANIS (JPICH programme, Spanish grant Ref. PCIN-2015-068) and READ (Horizon-2020 programme, Grant Ref. 674943). | es_ES |
dc.language | English | es_ES |
dc.publisher | Springer-Verlag | es_ES |
dc.relation.ispartof | Pattern Analysis and Applications | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject | Handwritten text processing | es_ES |
dc.subject | Keyword spotting | es_ES |
dc.subject | Multi-word Boolean queries | es_ES |
dc.subject | Image processing | es_ES |
dc.subject | Pattern recognition | es_ES |
dc.subject.classification | COMPUTER LANGUAGES AND SYSTEMS | es_ES |
dc.subject.classification | STATISTICS AND OPERATIONS RESEARCH | es_ES |
dc.title | Probabilistic multi-word spotting in handwritten text images | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1007/s10044-018-0742-z | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MINECO//PCIN-2015-068/ES/INDEXACION DE MANUSCRITOS HISTORICOS PARA BUSQUEDAS CONTROLADAS POR EL USUARIO/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/674943/EU/Recognition and Enrichment of Archival Documents/ | |
dc.relation.projectID | info:eu-repo/grantAgreement/Generalitat Valenciana//PROMETEO09%2F2009%2F014/ES/Adaptive learning and multimodality in pattern recognition (Almapater)/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MECD//FPU13%2F06281/ES/FPU13%2F06281/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Estadística e Investigación Operativa Aplicadas y Calidad - Departament d'Estadística i Investigació Operativa Aplicades i Qualitat | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació | es_ES |
dc.description.bibliographicCitation | Toselli, AH.; Vidal, E.; Puigcerver, J.; Noya-García, E. (2019). Probabilistic multi-word spotting in handwritten text images. Pattern Analysis and Applications. 22(1):23-32. https://doi.org/10.1007/s10044-018-0742-z | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1007/s10044-018-0742-z | es_ES |
dc.description.upvformatpinicio | 23 | es_ES |
dc.description.upvformatpfin | 32 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 22 | es_ES |
dc.description.issue | 1 | es_ES |
dc.relation.pasarela | S\372706 | es_ES |
dc.contributor.funder | Generalitat Valenciana | es_ES |
dc.contributor.funder | Ministerio de Educación | es_ES |
dc.contributor.funder | Ministerio de Economía y Empresa | es_ES |
dc.contributor.funder | European Commission | es_ES |
dc.description.references | Andreu Sanchez J, Romero V, Toselli A, Vidal E (2014) ICFHR2014 competition on handwritten text recognition on transcriptorium datasets (HTRtS). In: 14th International conference on frontiers in handwriting recognition (ICFHR), 2014, pp 785–790 | es_ES |
dc.description.references | Bazzi I, Schwartz R, Makhoul J (1999) An omnifont open-vocabulary OCR system for English and Arabic. IEEE Trans Pattern Anal Mach Intell 21(6):495–504 | es_ES |
dc.description.references | Bluche T, Hamel S, Kermorvant C, Puigcerver J, Stutzmann D, Toselli AH, Vidal E (2017) Preparatory KWS experiments for large-scale indexing of a vast medieval manuscript collection in the HIMANIS project. In: 2017 14th IAPR international conference on document analysis and recognition (ICDAR), vol. 01, pp 311–316. https://doi.org/10.1109/ICDAR.2017.59 | es_ES |
dc.description.references | Boole G (1854) An investigation of the laws of thought on which are founded the mathematical theories of logic and probabilities. Macmillan, New York | es_ES |
dc.description.references | Causer T, Wallace V (2012) Building a volunteer community: results and findings from Transcribe Bentham. Digital Humanities Quarterly 6 | es_ES |
dc.description.references | España-Boquera S, Castro-Bleda MJ, Gorbe-Moya J, Zamora-Martinez F (2011) Improving offline handwritten text recognition with hybrid HMM/ANN models. IEEE Trans Pattern Anal Mach Intell 33(4):767–779. https://doi.org/10.1109/TPAMI.2010.141 | es_ES |
dc.description.references | Fischer A, Wüthrich M, Liwicki M, Frinken V, Bunke H, Viehhauser G, Stolz M (2009) Automatic transcription of handwritten medieval documents. In: 15th International conference on virtual systems and multimedia, 2009. VSMM ’09, pp 137–142. https://doi.org/10.1109/VSMM.2009.26 | es_ES |
dc.description.references | Fréchet M (1935) Généralisations du théorème des probabilités totales. Seminarjum Matematyczne | es_ES |
dc.description.references | Fréchet M (1951) Sur les tableaux de corrélation dont les marges sont données. Ann Univ Lyon 3e sér Sci Sect A 14:53–77 | es_ES |
dc.description.references | Graves A, Liwicki M, Fernández S, Bertolami R, Bunke H, Schmidhuber J (2009) A novel connectionist system for unconstrained handwriting recognition. IEEE Trans Pattern Anal Mach Intell 31(5):855–868 | es_ES |
dc.description.references | Jelinek F (1998) Statistical methods for speech recognition. MIT Press, Cambridge | es_ES |
dc.description.references | Kneser R, Ney H (1995) Improved backing-off for N-gram language modeling. In: International conference on acoustics, speech and signal processing (ICASSP ’95), IEEE Computer Society, Los Alamitos, vol. 1, pp. 181–184, https://doi.org/10.1109/ICASSP.1995.479394 | es_ES |
dc.description.references | Kozielski M, Forster J, Ney H (2012) Moment-based image normalization for handwritten text recognition. In: Proceedings of the 2012 international conference on frontiers in handwriting recognition, ICFHR ’12, pp 256–261. IEEE Computer Society, Washington. https://doi.org/10.1109/ICFHR.2012.236 | es_ES |
dc.description.references | Lavrenko V, Rath TM, Manmatha R (2004) Holistic word recognition for handwritten historical documents. In: First Proceedings of international workshop on document image analysis for libraries, 2004, pp 278–287. https://doi.org/10.1109/DIAL.2004.1263256 | es_ES |
dc.description.references | Manning CD, Raghavan P, Schütze H (2008) Introduction to information retrieval. Cambridge University Press, New York | es_ES |
dc.description.references | Marti UV, Bunke H (2002) The IAM-database: an English sentence database for offline handwriting recognition. Int J Doc Anal Recogn 5:39–46. https://doi.org/10.1007/s100320200071 | es_ES |
dc.description.references | Noya-García E, Toselli AH, Vidal E (2017) Simple and effective multi-word query spotting in handwritten text images, pp 76–84. Springer International Publishing, Cham. https://doi.org/10.1007/978-3-319-58838-4_9 | es_ES |
dc.description.references | Pratikakis I, Zagoris K, Gatos B, Louloudis G, Stamatopoulos N (2014) ICFHR 2014 competition on handwritten keyword spotting (H-KWS 2014). In: 14th International conference on frontiers in handwriting recognition (ICFHR), 2014, pp 814–819 | es_ES |
dc.description.references | Puigcerver J, Toselli AH, Vidal E (2015) ICDAR2015 competition on keyword spotting for handwritten documents. In: 13th International conference on document analysis and recognition (ICDAR), 2015, pp 1176–1180 | es_ES |
dc.description.references | Riba P, Almazán J, Fornés A, Fernández-Mota D, Valveny E, Lladós J (2014) e-crowds: a mobile platform for browsing and searching in historical demography-related manuscripts. In: 14th International conference on frontiers in handwriting recognition (ICFHR), 2014, pp 228–233. https://doi.org/10.1109/ICFHR.2014.46 | es_ES |
dc.description.references | Robertson S (2008) A new interpretation of average precision. In: Proceedings of the international ACM SIGIR conference on research and development in information retrieval (SIGIR ’08), pp 689–690. ACM, New York. https://doi.org/10.1145/1390334.1390453 | es_ES |
dc.description.references | Romero V, Toselli AH, Vidal E (2012) Multimodal interactive handwritten text transcription. Series in machine perception and artificial intelligence (MPAI). World Scientific Publishing, Singapore | es_ES |
dc.description.references | Sánchez JA, Romero V, Toselli AH, Vidal E (2016) ICFHR2016 competition on handwritten text recognition on the READ dataset. In: 15th International conference on frontiers in handwriting recognition (ICFHR’16), pp 630–635. https://doi.org/10.1109/ICFHR.2016.0120 | es_ES |
dc.description.references | Toselli A, Vidal E (2015) Handwritten text recognition results on the Bentham collection with improved classical N-Gram-HMM methods. In: 3rd International workshop on historical document imaging and processing (HIP15), pp 15–22 | es_ES |
dc.description.references | Toselli AH, Juan A, Keysers D, González J, Salvador I, Ney H, Vidal E, Casacuberta F (2004) Integrated Handwriting Recognition and Interpretation using Finite-State Models. Int J Pattern Recogn Artif Intell 18(4):519–539 | es_ES |
dc.description.references | Toselli AH, Vidal E, Romero V, Frinken V (2016) HMM word graph based keyword spotting in handwritten document images. Inf Sci 370(C):497–518. https://doi.org/10.1016/j.ins.2016.07.063 | es_ES |
dc.description.references | Vidal E, Toselli AH, Puigcerver J (2015) High performance query-by-example keyword spotting using query-by-string techniques. In: Proceedings of 13th ICDAR, pp 741–745 | es_ES |
dc.description.references | Vidal E, Toselli AH, Puigcerver J (2017) Lexicon-based probabilistic keyword spotting in handwritten text images (to be published) | es_ES |
dc.description.references | Vinciarelli A, Bengio S, Bunke H (2004) Off-line recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Trans Pattern Anal Mach Intell 26(6):709–720 | es_ES |
dc.description.references | Young S, Evermann G, Gales M, Hain T, Kershaw D (2009) The HTK book: hidden Markov models toolkit V3.4. Microsoft Corporation and Cambridge Research Laboratory Ltd, Cambridge | es_ES |
dc.description.references | Young S, Odell J, Ollason D, Valtchev V, Woodland P (1997) The HTK book: hidden Markov models toolkit V2.1. Cambridge Research Laboratory Ltd, Cambridge | es_ES |
dc.description.references | Zhu M (2004) Recall, precision and average precision. Working paper 2004-09 Department of Statistics and Actuarial Science–University of Waterloo | es_ES |