Open set classification of untranscribed handwritten text image documents

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia


dc.contributor.author Prieto, José Ramón es_ES
dc.contributor.author Flores, Juan José es_ES
dc.contributor.author Vidal, Enrique es_ES
dc.contributor.author Toselli, Alejandro Héctor es_ES
dc.date.accessioned 2024-06-11T18:19:21Z
dc.date.available 2024-06-11T18:19:21Z
dc.date.issued 2023-08 es_ES
dc.identifier.issn 0167-8655 es_ES
dc.identifier.uri http://hdl.handle.net/10251/205018
dc.description.abstract [EN] Content-based classification of manuscripts is an important task that is generally carried out by expert archivists. Nevertheless, many historical manuscript collections are so vast that in most cases this task is hardly feasible, even for large, well-staffed archives. Nowadays, manuscripts are generally preserved in the form of sets of digital images. Therefore, the technical problem we are interested in is automatic classification of "image documents", each consisting of a set of untranscribed handwritten text images, by the textual contents of the images. The traditional Pattern Recognition classification paradigm does provide the basic tools to deal with this problem. However, in practice, the set of relevant classes of a large documental series is seldom known in advance. Therefore, a classifier trained with a predefined set of classes will systematically fail when new image documents arrive which do not belong to any of the classes assumed in training. Here we adopt the "Open Set Classification" framework to extend and consolidate our previous work on image document classification in order to adequately handle new documents from unknown classes. The proposed approaches are based on a relatively novel technology for text image representation known as "probabilistic indexing", which proves very effective at characterising the intrinsic word-level uncertainty exhibited by historical handwritten text images. We assess the performance of this approach on a moderately sized but representative dataset extracted from a huge series of complex notarial manuscripts from the Spanish Archivo Histórico Provincial de Cádiz, with good results. es_ES
dc.description.sponsorship Work partially supported by: Universitat Politècnica de València under grant FPI-I/SP20190010, Generalitat Valenciana under project DeepPattern (PROMETEO/2019/121), by grant PID2020-116813RB-I00 of MCIN/AEI/10.13039/501100011033 and by a María Zambrano grant of the Spanish Ministerio de Universidades and the European Union NextGenerationEU/PRTR. es_ES
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Pattern Recognition Letters es_ES
dc.rights Attribution (by) es_ES
dc.subject Open set document classification es_ES
dc.subject Handwritten text images es_ES
dc.subject Probabilistic indexing es_ES
dc.subject Neural networks es_ES
dc.title Open set classification of untranscribed handwritten text image documents es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.patrec.2023.06.006 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AGENCIA ESTATAL DE INVESTIGACION//PID2020-116813RB-I00//SEARCHING IN THE SIMANCA ARCHIVE/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UPV//SP20190010/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GVA//PROMETEO%2F2019%2F121//Deep learning for adaptative and multimodal interaction in pattern recognition/ es_ES
dc.rights.accessRights Open es_ES
dc.description.bibliographicCitation Prieto, JR.; Flores, JJ.; Vidal, E.; Toselli, AH. (2023). Open set classification of untranscribed handwritten text image documents. Pattern Recognition Letters. 172:113-120. https://doi.org/10.1016/j.patrec.2023.06.006 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1016/j.patrec.2023.06.006 es_ES
dc.description.upvformatpinicio 113 es_ES
dc.description.upvformatpfin 120 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 172 es_ES
dc.relation.pasarela S\495648 es_ES
dc.contributor.funder European Commission es_ES
dc.contributor.funder Generalitat Valenciana es_ES
dc.contributor.funder AGENCIA ESTATAL DE INVESTIGACION es_ES
dc.contributor.funder Universitat Politècnica de València es_ES

