
Study of the influence of lexicon and language restrictions on computer assisted transcription of historical manuscripts

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia


dc.contributor.author GRANELL, EMILIO es_ES
dc.contributor.author Romero, Verónica es_ES
dc.contributor.author Martínez-Hinarejos, Carlos-D. es_ES
dc.date.accessioned 2021-11-05T10:17:52Z
dc.date.available 2021-11-05T10:17:52Z
dc.date.issued 2020-05-21 es_ES
dc.identifier.issn 0925-2312 es_ES
dc.identifier.uri http://hdl.handle.net/10251/176078
dc.description.abstract [EN] State-of-the-art Handwritten Text Recognition (HTR) systems allow transcribers to speed up the transcription of handwritten text images. These systems provide transcribers with an initial draft transcription that can be corrected with less effort than transcribing the handwritten text images from scratch. Currently, even the draft transcriptions offered by the most advanced HTR systems contain errors. Therefore, supervision of this draft by a human transcriber is still necessary to obtain the correct transcription of the handwritten text images. This supervision can be eased by using interactive and assistive transcription systems, where the transcriber and the automatic system cooperate in the amending process. In this paper, the draft transcription is provided by an HTR system based on Convolutional and Recurrent Neural Networks with Bidirectional Long Short-Term Memory units, and the assistive system is fed by lattices generated by using Weighted Finite State Transducers. The influence of the lexicon and language restrictions on the performance of our computer assisted transcription system is evaluated on three historical manuscripts. The transcriptions offered by the proposed HTR system present very low error rates for the studied historical manuscripts. However, our assistive transcription system without lexicon or language restrictions is able to provide an additional reduction of more than 50% in the human effort required to correct the transcriptions, relative to the transcriptions offered by the HTR system. (C) 2020 Elsevier B.V. All rights reserved. es_ES
dc.description.sponsorship Work partially supported by the BBVA Foundation through the 2017-2018 Digital Humanities research grant "Carabela" and the grant "Ayudas Fundacion BBVA a equipos de investigacion cientifica 2018" (PR[18]_HUM_C2_0087), by the Generalitat Valenciana under the EU-FEDER Comunitat Valenciana 2014-2020 grant IDIFEDER/2018/025 "Sistemas de fabricacion inteligente para la industria 4.0" and the grant PROMETEO/2019/121 (DeepPattern), and by the Ministerio de Ciencia/AEI/FEDER/EU through the MIRANDA-DocTIUM project (RTI2018-095645-B-C22). es_ES
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Neurocomputing es_ES
dc.rights All rights reserved es_ES
dc.subject Handwritten text recognition es_ES
dc.subject Deep learning es_ES
dc.subject Interactive transcription es_ES
dc.subject.classification ESTADISTICA E INVESTIGACION OPERATIVA es_ES
dc.subject.classification LENGUAJES Y SISTEMAS INFORMATICOS es_ES
dc.title Study of the influence of lexicon and language restrictions on computer assisted transcription of historical manuscripts es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.neucom.2020.01.081 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-095645-B-C22/ES/TRANSCRIPCION DE DOCUMENTOS CON PLATAFORMAS INTERACTIVAS UBICUAS MULTIMODALES/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/fBBVA//PR[18]_HUM_C2_0087//CARABELA: INDEZACION PROBABILISTICA DE COLECCIONES DE MANUSCRITOS PARA PROTECCION DEL PATRIMONIO HISTORICO SUBACUATICO/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI//RTI2018-095645-B-C22//TRANSCRIPCION DE DOCUMENTOS CON PLATAFORMAS INTERACTIVAS UBICUAS MULTIMODALES/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//PROMETEOII%2F2014%2F030//Adaptive learning and multimodality in machine translation and text transcription/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EDUC.INVEST.CULT.DEP//IDIFEDER%2F2018%2F025//LABORATORIO DE FABRICACION AVANZADA INTELIGENTE/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Granell, E.; Romero, V.; Martínez-Hinarejos, C. (2020). Study of the influence of lexicon and language restrictions on computer assisted transcription of historical manuscripts. Neurocomputing. 390:12-27. https://doi.org/10.1016/j.neucom.2020.01.081 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1016/j.neucom.2020.01.081 es_ES
dc.description.upvformatpinicio 12 es_ES
dc.description.upvformatpfin 27 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 390 es_ES
dc.relation.pasarela S\409683 es_ES
dc.contributor.funder Fundación BBVA es_ES
dc.contributor.funder GENERALITAT VALENCIANA es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder Ministerio de Ciencia e Innovación es_ES

