dc.contributor.author | Anitei, Dan | es_ES |
dc.contributor.author | Sánchez Peiró, Joan Andreu | es_ES |
dc.contributor.author | Benedí Ruiz, José Miguel | es_ES |
dc.contributor.author | Noya García, Ernesto | es_ES |
dc.date.accessioned | 2023-12-18T19:03:33Z | |
dc.date.available | 2023-12-18T19:03:33Z | |
dc.date.issued | 2023-08 | es_ES |
dc.identifier.issn | 0167-8655 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/200851 | |
dc.description.abstract | [EN] Searching for information in printed scientific documents is a challenging problem that has recently received special attention from the Pattern Recognition research community. Mathematical expressions are complex elements that appear in scientific documents, and developing techniques for locating and recognizing them requires the preparation of datasets that can be used as benchmarks. Most current techniques for dealing with mathematical expressions are based on Machine Learning techniques which require a large amount of annotated data. These datasets must be prepared with ground-truth information for automatic training and testing. However, preparing large datasets with ground-truth is a very expensive and time-consuming task. This paper introduces the IBEM dataset, consisting of scientific documents that have been prepared for mathematical expression recognition and searching. This dataset consists of 600 documents, more than 8200 page images with more than 160000 mathematical expressions. It has been automatically generated from the LaTeX version of the documents and can be enlarged easily. The ground-truth includes the position at the page level and the LaTeX transcript for mathematical expressions both embedded in the text and displayed. This paper also reports a baseline classification experiment with mathematical symbols and a baseline experiment of Mathematical Expression Recognition performed on the IBEM dataset. These experiments aim to provide some benchmarks for comparison purposes so that future users of the IBEM dataset can have a baseline framework. | es_ES |
dc.description.sponsorship | This work has been partially supported by MCIN/AEI/10.13039/501100011033 under the grant PID2020-116813RB-I00; the Generalitat Valenciana under the FPI grant CIACIF/2021/313; and by the support of the Valencian Graduate School and Research Network of Artificial Intelligence. | es_ES |
dc.language | English | es_ES |
dc.publisher | Elsevier | es_ES |
dc.relation.ispartof | Pattern Recognition Letters | es_ES |
dc.rights | Attribution - NonCommercial - NoDerivatives (by-nc-nd) | es_ES |
dc.subject | Mathematical expression dataset | es_ES |
dc.subject | Mathematical expression recognition | es_ES |
dc.subject | Mathematical expression retrieval | es_ES |
dc.subject | Mathematical symbols classification | es_ES |
dc.subject.classification | COMPUTER LANGUAGES AND SYSTEMS | es_ES |
dc.title | The IBEM dataset: A large printed scientific image dataset for indexing and searching mathematical expressions | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1016/j.patrec.2023.05.033 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-116813RB-I00/ES/SEARCHING IN THE SIMANCA ARCHIVE/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//CIACIF%2F2021%2F313//Indexación y búsqueda de expresiones matemáticas basada en redes neuronales profundas para colecciones masivas de imágenes de documentos científicos/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica | es_ES |
dc.description.bibliographicCitation | Anitei, D.; Sánchez Peiró, JA.; Benedí Ruiz, JM.; Noya García, E. (2023). The IBEM dataset: A large printed scientific image dataset for indexing and searching mathematical expressions. Pattern Recognition Letters. 172:29-36. https://doi.org/10.1016/j.patrec.2023.05.033 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1016/j.patrec.2023.05.033 | es_ES |
dc.description.upvformatpinicio | 29 | es_ES |
dc.description.upvformatpfin | 36 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 172 | es_ES |
dc.relation.pasarela | S\494886 | es_ES |
dc.contributor.funder | GENERALITAT VALENCIANA | es_ES |
dc.contributor.funder | AGENCIA ESTATAL DE INVESTIGACION | es_ES |
dc.contributor.funder | Instituto Valenciano de Investigación en Inteligencia Artificial | es_ES |
dc.contributor.funder | Universitat Politècnica de València |