dc.contributor.author | Toselli, Alejandro Héctor | es_ES |
dc.contributor.author | Leiva, Luis A. | es_ES |
dc.contributor.author | Bordes-Cabrera, Isabel | es_ES |
dc.contributor.author | Hernández-Tornero, Celio | es_ES |
dc.contributor.author | Bosch Campos, Vicente | es_ES |
dc.contributor.author | Vidal, Enrique | es_ES |
dc.date.accessioned | 2020-10-06T03:32:03Z | |
dc.date.available | 2020-10-06T03:32:03Z | |
dc.date.issued | 2018-04 | es_ES |
dc.identifier.issn | 2055-7671 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/151164 | |
dc.description.abstract | [EN] We present a process for cost-effective transcription of cursive handwritten text images that has been tested on a 1,000-page 17th-century book about botanical species. The process comprised two main tasks, namely: (1) preprocessing: page layout analysis, text line detection, and extraction; and (2) transcription of the extracted text line images. Both tasks were carried out with semiautomatic procedures, aimed at incrementally minimizing user correction effort, by means of computer-assisted line detection and interactive handwritten text recognition technologies. The contribution derived from this work is three-fold. First, we provide a detailed human-supervised transcription of a relatively large historical handwritten book, ready to be searchable, indexable, and accessible to cultural heritage scholars as well as the general public. Second, we have conducted the first longitudinal study to date on interactive handwriting text recognition, for which we provide a very comprehensive user assessment of the real-world performance of the technologies involved in this work. Third, as a result of this process, we have produced a detailed transcription and document layout information (i.e. high-quality labeled data) ready to be used by researchers working on automated technologies for document analysis and recognition. | es_ES |
dc.description.sponsorship | This work is supported by the European Commission through the EU projects HIMANIS (JPICH program, Spanish, grant Ref. PCIN-2015-068) and READ (Horizon-2020 program, grant Ref. 674943); and the Universitat Politècnica de València (grant number SP20130189). This work was also part of the Valorization and I+D+i Resources program of VLC/CAMPUS and has been funded by the Spanish MECD as part of the International Excellence Campus program. | es_ES |
dc.language | English | es_ES |
dc.publisher | Oxford University Press | es_ES |
dc.relation.ispartof | Digital Scholarship in the Humanities | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject | Handwriting recognition | es_ES |
dc.subject | Images | es_ES |
dc.subject | Models | es_ES |
dc.subject.classification | STATISTICS AND OPERATIONS RESEARCH | es_ES |
dc.subject.classification | COMPUTER LANGUAGES AND SYSTEMS | es_ES |
dc.title | Transcribing a 17th-century botanical manuscript: Longitudinal evaluation of document layout detection and interactive transcription | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1093/llc/fqw064 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/674943/EU/Recognition and Enrichment of Archival Documents/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/UPV//SP20130189/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Estadística e Investigación Operativa Aplicadas y Calidad - Departament d'Estadística i Investigació Operativa Aplicades i Qualitat | es_ES |
dc.description.bibliographicCitation | Toselli, AH.; Leiva, LA.; Bordes-Cabrera, I.; Hernández-Tornero, C.; Bosch Campos, V.; Vidal, E. (2018). Transcribing a 17th-century botanical manuscript: Longitudinal evaluation of document layout detection and interactive transcription. Digital Scholarship in the Humanities. 33(1):173-202. https://doi.org/10.1093/llc/fqw064 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1093/llc/fqw064 | es_ES |
dc.description.upvformatpinicio | 173 | es_ES |
dc.description.upvformatpfin | 202 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 33 | es_ES |
dc.description.issue | 1 | es_ES |
dc.relation.pasarela | S\338508 | es_ES |
dc.contributor.funder | Universitat Politècnica de València | es_ES |
dc.description.references | Bazzi, I., Schwartz, R., & Makhoul, J. (1999). An omnifont open-vocabulary OCR system for English and Arabic. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(6), 495-504. doi:10.1109/34.771314 | es_ES |
dc.description.references | Causer, T., Tonra, J., & Wallace, V. (2012). Transcription maximized; expense minimized? Crowdsourcing and editing The Collected Works of Jeremy Bentham. Literary and Linguistic Computing, 27(2), 119-137. doi:10.1093/llc/fqs004 | es_ES |
dc.description.references | Ramel, J. Y., Leriche, S., Demonet, M. L., & Busson, S. (2007). User-driven page layout analysis of historical printed books. International Journal of Document Analysis and Recognition (IJDAR), 9(2-4), 243-261. doi:10.1007/s10032-007-0040-6 | es_ES |
dc.description.references | Romero, V., Fornés, A., Serrano, N., Sánchez, J. A., Toselli, A. H., Frinken, V., … Lladós, J. (2013). The ESPOSALLES database: An ancient marriage license corpus for off-line handwriting recognition. Pattern Recognition, 46(6), 1658-1669. doi:10.1016/j.patcog.2012.11.024 | es_ES |
dc.description.references | Romero, V., Toselli, A. H., & Vidal, E. (2012). Multimodal Interactive Handwritten Text Transcription. Series in Machine Perception and Artificial Intelligence. doi:10.1142/8394 | es_ES |
dc.description.references | Toselli, A. H., Romero, V., Pastor, M., & Vidal, E. (2010). Multimodal interactive transcription of text images. Pattern Recognition, 43(5), 1814-1825. doi:10.1016/j.patcog.2009.11.019 | es_ES |
dc.description.references | Toselli, A. H., Vidal, E., Romero, V., & Frinken, V. (2016). HMM word graph based keyword spotting in handwritten document images. Information Sciences, 370-371, 497-518. doi:10.1016/j.ins.2016.07.063 | es_ES |
dc.description.references | Bunke, H., Bengio, S., & Vinciarelli, A. (2004). Offline recognition of unconstrained handwritten texts using HMMs and statistical language models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(6), 709-720. doi:10.1109/tpami.2004.14 | es_ES |