
Towards the Natural Language Processing as Spelling Correction for Offline Handwritten Text Recognition Systems

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia


dc.contributor.author Arthur Flor de Sousa Neto
dc.contributor.author Byron L. D. Bezerra
dc.contributor.author Toselli, Alejandro Héctor
dc.date.accessioned 2024-05-23T18:05:27Z
dc.date.available 2024-05-23T18:05:27Z
dc.date.issued 2020-11
dc.identifier.uri http://hdl.handle.net/10251/204390
dc.description.abstract [EN] The increasing portability of physical manuscripts to the digital environment makes it common for systems to offer automatic mechanisms for offline Handwritten Text Recognition (HTR). However, varied scenarios and writing styles make accurate recognition challenging; to mitigate this problem, optical models can be combined with language models that assist in decoding text. To improve results, dictionaries of characters and words are generated from the dataset and linguistic restrictions are imposed on the recognition process. This work therefore proposes the use of spelling correction techniques for text post-processing, to achieve better results and eliminate the linguistic dependence between the optical model and the decoding stage. In addition, an encoder-decoder neural network architecture and a training methodology are developed and presented for spelling correction. To demonstrate the effectiveness of this approach, we conducted an experiment with five text-line datasets widely known in the field of HTR, three state-of-the-art optical models for text recognition, and eight spelling correction techniques, ranging from traditional statistical methods to current neural network approaches in the field of Natural Language Processing (NLP). Finally, our proposed spelling correction model is analyzed statistically through HTR system metrics, reaching an average sentence correction 54% higher than the state-of-the-art decoding method on the tested datasets.
dc.description.sponsorship This research was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES), Finance Code 001, and by CNPq Grant No. 315251/2018-2.
dc.language English
dc.publisher MDPI AG
dc.relation info:eu-repo/grantAgreement/CAPES//001/
dc.relation info:eu-repo/grantAgreement/CNPq//315251%2F2018-2
dc.relation.ispartof Applied Sciences
dc.rights Attribution (by)
dc.subject Deep learning
dc.subject Offline handwritten text recognition
dc.subject Natural language processing
dc.subject Encoder decoder model
dc.subject Spelling correction
dc.title Towards the Natural Language Processing as Spelling Correction for Offline Handwritten Text Recognition Systems
dc.type Article
dc.identifier.doi 10.3390/app10217711
dc.rights.accessRights Open
dc.description.bibliographicCitation Arthur Flor de Sousa Neto; Byron L. D. Bezerra; Toselli, AH. (2020). Towards the Natural Language Processing as Spelling Correction for Offline Handwritten Text Recognition Systems. Applied Sciences. 10(21). https://doi.org/10.3390/app10217711
dc.description.accrualMethod S
dc.relation.publisherversion https://doi.org/10.3390/app10217711
dc.type.version info:eu-repo/semantics/publishedVersion
dc.description.volume 10
dc.description.issue 21
dc.identifier.eissn 2076-3417
dc.relation.pasarela S\466382
dc.contributor.funder Conselho Nacional de Desenvolvimento Científico e Tecnológico, Brasil
dc.contributor.funder Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, Brasil

