Named entity recognition in handwritten text images from the k best transcripts

Giner Pérez de Lucía, José

Giner Pérez De Lucía, J. (2023). Named entity recognition in handwritten text images from the k best transcripts. Universitat Politècnica de València. http://hdl.handle.net/10251/202314

Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/202314

Files in this item

Name: Giner - Named entitiy ...

Size: 10.93 MB

Format: PDF

Item metadata

Title:

Named entity recognition in handwritten text images from the k best transcripts

Other titles:

Reconocimiento de entidades nombradas en imágenes de textos manuscritos a partir de las k mejores transcripciones
Reconeixement d'entitats nombrades en imatges de textos manuscrits a partir de les k millors transcripcions

Author:

Giner Pérez de Lucía, José

Advisor(s):

Sánchez Peiró, Joan Andreu

UPV entity:

Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació

Defense date:

2023-12-18

Release date:

2024-02-05

Abstract:

[ES] Named entity recognition is a relevant problem in Natural Language Processing tasks, and it is therefore of particular relevance in the recognition of handwritten text images. This ...

[EN] Named Entity Recognition (NER) in ancient handwritten texts is a challenging area of research in artificial intelligence and natural language processing. It consists of identifying and classifying specific entities, such as names of people, places, dates and organizations, in handwritten texts in different ancient languages. This process involves several challenges due to the variable nature of handwriting, the evolution of language over time, inconsistent spelling, and the presence of abbreviations, among other factors. To address these challenges, image processing and deep learning techniques are applied, including convolutional and recurrent neural networks, which have proven able to achieve competitive results for NER over the last decade.
In this master's thesis, a recurrent neural network has been designed for NER in 16th-century manuscript texts belonging to some pages of the ancient collection of books of records of royal decrees held in the General Archive of Simancas, one of the most important archives documenting the cultural, political and social evolution of Spain throughout history. Specifically, the neural model learns distributed representations of words and characters (referred to as embeddings) and is composed of bidirectional long short-term memory (Bi-LSTM) modules with a conditional random field (CRF) as the output layer. The results obtained at line level for the manual transcriptions reflect generally good recognition performance on all types of entities (F1 score of 0.83 and WTER of 5.4%), and specifically on person names and surnames (F1 scores of 0.94 and 0.89 respectively).
The model has also been evaluated with the k-best transcriptions of each line generated by a handwritten text recognition process, which may fail to detect certain words present in the manuscripts. A 12% increase in WTER and a 0.20 drop in F1 score were observed for the best decodings (known as the 1-best), and a 7% increase in WTER and a 0.09 drop in F1 score were observed after considering the 10 best decodings (10-best).
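
The abstract mentions a CRF output layer on top of the Bi-LSTM. As a rough illustration of what such a layer contributes at prediction time, the following is a minimal sketch of Viterbi decoding for a linear-chain CRF. It is not taken from the thesis: the tag names, the example tokens and all scores are invented for illustration, and the per-token emission scores stand in for what a Bi-LSTM would produce.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][tag]     : score of `tag` at position t (hypothetical Bi-LSTM output)
    transitions[a][b]     : score of moving from tag `a` to tag `b`
    """
    tags = list(emissions[0].keys())
    # best_score[t][tag]: best score of any path that ends in `tag` at position t
    best_score = [dict(emissions[0])]
    backpointers: List[Dict[str, str]] = []

    for t in range(1, len(emissions)):
        scores_t: Dict[str, float] = {}
        pointers_t: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximizes path score plus transition score.
            prev = max(tags, key=lambda p: best_score[t - 1][p] + transitions[p][cur])
            scores_t[cur] = best_score[t - 1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers_t[cur] = prev
        best_score.append(scores_t)
        backpointers.append(pointers_t)

    # Follow the backpointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best_score[-1][tag])]
    for pointers_t in reversed(backpointers):
        path.append(pointers_t[path[-1]])
    path.reverse()
    return path


# Toy scores (assumed, not from the thesis) for three tokens such as "don Juan Perez".
emissions = [
    {"O": 2.0, "B-PER": 0.5, "I-PER": 0.0},
    {"O": 0.2, "B-PER": 1.5, "I-PER": 1.6},
    {"O": 0.3, "B-PER": 0.4, "I-PER": 2.0},
]
# Transition scores; O -> I-PER is heavily penalized because it is an invalid BIO move.
transitions = {
    "O": {"O": 0.5, "B-PER": 0.0, "I-PER": -10.0},
    "B-PER": {"O": 0.0, "B-PER": -1.0, "I-PER": 1.0},
    "I-PER": {"O": 0.0, "B-PER": -1.0, "I-PER": 0.5},
}

print(viterbi_decode(emissions, transitions))
```

In this toy case, picking the best tag per token independently would label the second token I-PER directly after O, which is not a valid BIO sequence. The transition scores steer the decoder to B-PER there instead, which is the kind of sequence-level consistency a CRF layer is typically used for.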

Keywords:

Conditional random field (CRF), Bi-LSTM, Bidirectional memory modules, Named entity recognition, Deep learning, K-best transcriptions, Named entities, Ancient handwritten text

Usage rights:

All rights reserved

Publisher:

Universitat Politècnica de València

Degree:

Máster Universitario en Inteligencia Artificial, Reconocimiento de Formas e Imagen Digital-Màster Universitari en Intel·ligència Artificial, Reconeixement de Formes i Imatge Digital

Type:

Master's thesis

This item appears in the following collection(s)

Servicio de alumnado - Trabajos académicos [7391]

RiuNet: Institutional Repository of the Universitat Politècnica de València