
Handwriting recognition in historical documents using very large vocabularies

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia



dc.contributor.author Frinken, Volkmar es_ES
dc.contributor.author Fischer, Andreas es_ES
dc.contributor.author Martínez-Hinarejos, Carlos-D. es_ES
dc.date.accessioned 2017-02-20T09:43:22Z
dc.date.available 2017-02-20T09:43:22Z
dc.date.issued 2013-08
dc.identifier.isbn 978-1-4503-2115-0
dc.identifier.uri http://hdl.handle.net/10251/78056
dc.description © ACM 2013. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in HIP '13 Proceedings of the 2nd International Workshop on Historical Document Imaging and Processing, http://dx.doi.org/10.1145/2501115.2501116 es_ES
dc.description.abstract Language models are used in automatic transcription systems to resolve ambiguities. This is done by limiting the vocabulary of words that can be recognized as well as by estimating the n-gram probabilities of the words in the given text. In the context of historical documents, non-unified spelling and the limited amount of written text pose a substantial problem for the selection of the recognizable vocabulary as well as for the computation of the word probabilities. In this paper, we propose, for the transcription of historical Spanish text, to keep the n-gram corpus limited to a sample of the target text but to expand the vocabulary with words gathered from external resources. We analyze the performance of such a transcription system with different sizes of external vocabularies and demonstrate the applicability of the approach and the significant increase in recognition accuracy obtained by using up to 300 thousand external words. es_ES
dc.description.sponsorship This work has been supported by the European project FP7-PEOPLE-2008-IAPP: 230653, the European Research Council’s Advanced Grant ERC-2010-AdG 20100407, the Spanish R&D projects TIN2009-14633-C03-03, RYC-2009-05031, TIN2011-24631, TIN2012-37475-C02-02, MITTRAL (TIN2009-14633-C03-01), and Active2Trans (TIN2012-31723), as well as the Swiss National Science Foundation fellowship project PBBEP2_141453. es_ES
dc.format.extent 6 es_ES
dc.language English es_ES
dc.publisher ACM es_ES
dc.rights All rights reserved es_ES
dc.subject Historical documents es_ES
dc.subject Handwriting recognition es_ES
dc.subject Language modeling es_ES
dc.subject.classification Computer Languages and Systems es_ES
dc.title Handwriting recognition in historical documents using very large vocabularies es_ES
dc.type Conference paper es_ES
dc.identifier.doi 10.1145/2501115.2501116
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/230653/EU/Administrative Document Automate Optimization/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2009-14633-C03-03/ES/Extraccion De Conocimiento De Imagenes De Documentos Con Contenidos Heterogeneos/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/269796/EU/Five Centuries of Marriages/
dc.relation.projectID info:eu-repo/grantAgreement/SNSF//PBBEP2_141453/CH/Bootstrapping Handwriting Recognition Systems for Historical Documents/
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//RYC-2009-05031/ES/RYC-2009-05031/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2011-24631/ES/TEXTO EN LA CIUDAD - COMPRENSION CENTRADA EN HUMANOS DE TEXTO EN ESCENAS/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//TIN2012-37475-C02-02/ES/RECONOCIMIENTO CONTEXTUAL EN DOCUMENTOS ANTIGUOS/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2009-14633-C03-01/ES/Multimodal Interaction For Text Transcription With Adaptive Learning/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//TIN2012-31723/ES/INTERACCION ACTIVA PARA TRANSCRIPCION DE HABLA Y TRADUCCION/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.description.bibliographicCitation Frinken, V.; Fischer, A.; Martínez-Hinarejos, C. (2013). Handwriting recognition in historical documents using very large vocabularies. ACM. https://doi.org/10.1145/2501115.2501116 es_ES
dc.description.accrualMethod S es_ES
dc.relation.conferencename 2nd International Workshop on Historical Document Imaging and Processing es_ES
dc.relation.conferencedate August 24, 2013 es_ES
dc.relation.conferenceplace Washington, DC, USA es_ES
dc.relation.publisherversion http://dx.doi.org/10.1145/2501115.2501116 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.relation.senia 259932 es_ES
dc.contributor.funder European Commission es_ES
dc.contributor.funder European Research Council es_ES
dc.contributor.funder Ministerio de Economía y Competitividad es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder Ministerio de Ciencia e Innovación es_ES
dc.contributor.funder Swiss National Science Foundation es_ES
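
The abstract describes keeping the n-gram corpus limited to a sample of the target text while expanding the recognizable vocabulary with external words. The following is a minimal sketch of that idea, not the authors' implementation: the function name `train_bigram_lm`, the choice of a bigram model, and the add-one smoothing are all assumptions made for illustration, and the example words are made-up sample data. The record does not state which n-gram order or smoothing method the paper uses.

```python
from collections import Counter
from math import log


def train_bigram_lm(corpus_sentences, external_words):
    """Hypothetical helper: bigram model estimated on a small corpus,
    with add-one smoothing over a vocabulary expanded by external words."""
    # Vocabulary = words in the target-text sample + external word list.
    vocab = {w for sent in corpus_sentences for w in sent} | set(external_words)
    vocab |= {"<s>", "</s>"}  # sentence boundary markers

    history_counts = Counter()  # how often each word occurs as a history
    bigram_counts = Counter()   # how often each (history, word) pair occurs
    for sent in corpus_sentences:
        tokens = ["<s>"] + list(sent) + ["</s>"]
        history_counts.update(tokens[:-1])
        bigram_counts.update(zip(tokens[:-1], tokens[1:]))

    vocab_size = len(vocab)

    def log_prob(prev, word):
        # Words outside the expanded vocabulary cannot be recognized at all.
        if word not in vocab:
            return float("-inf")
        # Add-one smoothing (an assumption): external words never seen in
        # the corpus still receive a small non-zero probability.
        return log((bigram_counts[(prev, word)] + 1)
                   / (history_counts[prev] + vocab_size))

    return vocab, log_prob


# Illustrative data only: a tiny "target text" sample and two external words.
corpus = [["en", "la", "ciudad"], ["en", "la", "villa"]]
external = ["cibdad", "uilla"]
vocab, log_prob = train_bigram_lm(corpus, external)
```

In this sketch, a word such as `cibdad` that never occurs in the sample corpus becomes recognizable through the external list, while word sequences observed in the sample still score higher than unseen ones.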

