Villegas, M.; Toselli, AH.; Romero Gómez, V.; Vidal, E. (2016). Exploiting Existing Modern Transcripts for Historical Handwritten Text Recognition. IEEE. https://doi.org/10.1109/ICFHR.2016.22
Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/87627
Title:
|
Exploiting Existing Modern Transcripts for Historical Handwritten Text Recognition
|
Authors:
|
Villegas, Mauricio
Toselli, Alejandro Héctor
Romero Gómez, Verónica
Vidal, Enrique
|
UPV entity:
|
Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació
Universitat Politècnica de València. Departamento de Estadística e Investigación Operativa Aplicadas y Calidad - Departament d'Estadística i Investigació Operativa Aplicades i Qualitat
|
Release date:
|
|
Abstract:
|
[EN] Existing transcripts for historic manuscripts are a very valuable resource for training models useful for automatic recognition, aided transcription, and/or indexing of the remaining untranscribed parts of these collections. However, these existing transcripts generally exhibit two main problems which hinder their use: a) the text of the transcripts is seldom aligned with the manuscript lines, and b) the text often deviates very significantly from what can be seen in the manuscript, either because the writing style has been modernized or abbreviations have been expanded, or both. This work presents an analysis of these problems and discusses possible solutions for minimizing the human effort needed to adapt existing transcripts in order to render them usable. The empirical results presented show the huge performance gain that can be obtained by adequately adapting the transcripts, thus motivating future development of the proposed solutions.
|
Keywords:
|
Handwritten Text Recognition, Historical Manuscripts, Modernized Transcripts, Transcript-image Alignment, Diplomatization
|
Usage rights:
|
All rights reserved
|
ISBN:
|
978-1-5090-0981-7
|
Source:
|
|
DOI:
|
10.1109/ICFHR.2016.22
|
Publisher:
|
IEEE
|
Publisher's version:
|
https://doi.org/10.1109/ICFHR.2016.0025
|
Conference title:
|
15th International Conference on Frontiers in Handwriting Recognition (ICFHR 2016)
|
Conference location:
|
Shenzhen, China
|
Conference date:
|
October 23-26, 2016
|
Project code:
|
info:eu-repo/grantAgreement/MINECO//PCIN-2015-068/ES/INDEXACION DE MANUSCRITOS HISTORICOS PARA BUSQUEDAS CONTROLADAS POR EL USUARIO/
info:eu-repo/grantAgreement/EC/H2020/674943/EU/Recognition and Enrichment of Archival Documents/
info:eu-repo/grantAgreement/MINECO//TIN2015-70924-C2-1-R/ES/CONTEXTO, MULTIMODALIDAD Y COLABORACION DEL USUARIO EN PROCESADO DE TEXTO MANUSCRITO/
|
Description:
|
© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
|
Acknowledgements:
|
We are very grateful to Carlos Lechner and Celio Hernández, who helped in the creation of the ground truth of the Alcaraz dataset. This work has been partially supported by the European Union (EU) Horizon 2020 grant READ (Recognition and Enrichment of Archival Documents) (Ref: 674943), EU project HIMANIS (JPICH programme, Spanish grant Ref: PCIN-2015-068) and MINECO/FEDER, UE under project TIN2015-70924-C2-1-R.
|
Type:
|
Conference paper
|