Using the MGGI Methodology for Category-based Language Modeling in Handwritten Marriage Licenses Books

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia

dc.contributor.author Romero Gómez, Verónica es_ES
dc.contributor.author Fornes, Alicia es_ES
dc.contributor.author Vidal Ruiz, Enrique es_ES
dc.contributor.author Sánchez Peiró, Joan Andreu es_ES
dc.date.accessioned 2017-09-20T11:05:34Z
dc.date.available 2017-09-20T11:05:34Z
dc.date.issued 2016-10-23
dc.identifier.issn 2167-6445
dc.identifier.uri http://hdl.handle.net/10251/87633
dc.description © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. es_ES
dc.description.abstract Handwritten marriage licenses books have been used for centuries by ecclesiastical and secular institutions to register marriages. The information contained in these historical documents is useful for demographic studies and genealogical research, among others. Despite the generally simple structure of the text in these documents, automatic transcription and semantic information extraction are difficult due to the distinct and evolving vocabulary, which is composed mainly of proper names that change over time. In previous work, we studied the use of category-based language models both to improve automatic transcription accuracy and to ease the extraction of semantic information. Here we analyze the main causes of the semantic errors observed in previous results and apply a Grammatical Inference technique known as MGGI to improve the semantic accuracy of the language model obtained. Using this language model, full handwritten text recognition experiments have been carried out, with results supporting the interest of the proposed approach. es_ES
dc.description.sponsorship This work has been partially supported through the European Union’s H2020 grant READ (Ref: 674943), the European project ERC-2010-AdG-20100407-269796, the MINECO/FEDER, UE projects TIN2015-70924-C2-1-R and TIN2015-70924-C2-2-R, and the Ramón y Cajal Fellowship RYC-2014-16831. es_ES
dc.format.extent 6 es_ES
dc.language English es_ES
dc.publisher IEEE es_ES
dc.rights All rights reserved es_ES
dc.subject Handwritten Documents es_ES
dc.subject Information extraction es_ES
dc.subject Language modeling es_ES
dc.subject MGGI es_ES
dc.subject Categories-based language model es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.subject.classification STATISTICS AND OPERATIONS RESEARCH es_ES
dc.title Using the MGGI Methodology for Category-based Language Modeling in Handwritten Marriage Licenses Books es_ES
dc.type Conference paper es_ES
dc.identifier.doi 10.1109/ICFHR.2016.0069
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/674943/EU/Recognition and Enrichment of Archival Documents/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//TIN2015-70924-C2-1-R/ES/CONTEXTO, MULTIMODALIDAD Y COLABORACION DEL USUARIO EN PROCESADO DE TEXTO MANUSCRITO/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//TIN2015-70924-C2-2-R/ES/CONTEXTUALIZACION DE CONTENIDOS EN EL RECONOCIMIENTO DE IMAGENES DE DOCUMENTOS DE ARCHIVOS/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/269796/EU/Five Centuries of Marriages/
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//RYC-2014-16831/ES/RYC-2014-16831/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Estadística e Investigación Operativa Aplicadas y Calidad - Departament d'Estadística i Investigació Operativa Aplicades i Qualitat es_ES
dc.description.bibliographicCitation Romero Gómez, V.; Fornes, A.; Vidal Ruiz, E.; Sánchez Peiró, JA. (2016). Using the MGGI Methodology for Category-based Language Modeling in Handwritten Marriage Licenses Books. IEEE. https://doi.org/10.1109/ICFHR.2016.0069 es_ES
dc.description.accrualMethod S es_ES
dc.relation.conferencename 15th International Conference on Frontiers in Handwriting Recognition (ICFHR 2016) es_ES
dc.relation.conferencedate October 23-26, 2016 es_ES
dc.relation.conferenceplace Shenzhen, China es_ES
dc.relation.publisherversion http://ieeexplore.ieee.org/document/7814085/ es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.relation.senia 320623 es_ES
dc.contributor.funder European Commission es_ES
dc.contributor.funder European Research Council es_ES
dc.contributor.funder Ministerio de Economía y Competitividad es_ES
dc.contributor.funder European Regional Development Fund es_ES

