
Page Segmentation of Structured Documents Using 2D Stochastic Context-Free Grammars

RiuNet: Institutional repository of the Polytechnic University of Valencia



dc.contributor.author Álvaro Muñoz, Francisco es_ES
dc.contributor.author Cruz Fernández, Francisco es_ES
dc.contributor.author Sánchez Peiró, Joan Andreu es_ES
dc.contributor.author Ramos Terrades, Oriol es_ES
dc.contributor.author Benedí Ruiz, José Miguel es_ES
dc.date.accessioned 2016-10-07T07:25:09Z
dc.date.available 2016-10-07T07:25:09Z
dc.date.issued 2013
dc.identifier.isbn 978-3-642-38627-5
dc.identifier.issn 0302-9743
dc.identifier.uri http://hdl.handle.net/10251/71363
dc.description The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-38628-2_15 es_ES
dc.description.abstract In this paper we define a bidimensional extension of Stochastic Context-Free Grammars for page segmentation of structured documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the page segmentation is obtained as the most likely hypothesis according to a grammar. This approach is compared to Conditional Random Fields and results show significant improvements in several cases. Furthermore, grammars provide a detailed segmentation that allowed a semantic evaluation which also validates this model. es_ES
dc.description.sponsorship Work partially supported by the Spanish MEC under the STraDA research project (TIN2012-37475-C02-01), the MITTRAL (TIN2009-14633-C03-01) project, the Spanish projects TIN2009-14633-C03-01/03 and 2010-CONES-00029, the FPU grant (AP2009-4363), by the Generalitat Valenciana under the grant Prometeo/2009/014, and through the EU 7th Framework Programme grant tranScriptorium (Ref: 600707)
dc.format.extent 8 es_ES
dc.language English es_ES
dc.publisher Springer es_ES
dc.relation.ispartof Pattern Recognition and Image Analysis es_ES
dc.relation.ispartofseries Lecture Notes in Computer Science;7887
dc.rights All rights reserved es_ES
dc.subject Document segmentation es_ES
dc.subject Stochastic context-free grammars es_ES
dc.subject Text classification features es_ES
dc.subject.classification Computer Languages and Systems (LENGUAJES Y SISTEMAS INFORMATICOS) es_ES
dc.title Page Segmentation of Structured Documents Using 2D Stochastic Context-Free Grammars es_ES
dc.type Book chapter es_ES
dc.type Conference paper es_ES
dc.identifier.doi 10.1007/978-3-642-38628-2_15
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/600707/EU/tranScriptorium/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//TIN2012-37475-C02-01/ES/SEARCH IN TRANSCRIBED MANUSCRIPTS AND DOCUMENT AUGMENTATION/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2009-14633-C03-03/ES/Extraccion De Conocimiento De Imagenes De Documentos Con Contenidos Heterogeneos/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//TIN2009-14633-C03-01/ES/Multimodal Interaction For Text Transcription With Adaptive Learning/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/ME//AP2009-4363/ES/AP2009-4363/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GV//Prometeo-2009/014/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.description.bibliographicCitation Álvaro Muñoz, F.; Cruz Fernández, F.; Sánchez Peiró, JA.; Ramos Terrades, O.; Benedí Ruiz, JM. (2013). Page Segmentation of Structured Documents Using 2D Stochastic Context-Free Grammars. In Pattern Recognition and Image Analysis. Springer. 133-140. https://doi.org/10.1007/978-3-642-38628-2_15 es_ES
dc.description.accrualMethod S es_ES
dc.relation.conferencename 6th Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2013) es_ES
dc.relation.conferencedate June 5-7, 2013 es_ES
dc.relation.conferenceplace Funchal, Madeira, Portugal es_ES
dc.relation.publisherversion http://link.springer.com/chapter/10.1007/978-3-642-38628-2_15 es_ES
dc.description.upvformatpinicio 133 es_ES
dc.description.upvformatpfin 140 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.relation.senia 253864 es_ES
dc.contributor.funder Ministerio de Educación
dc.contributor.funder Ministerio de Educación, Cultura y Deporte
dc.contributor.funder Generalitat Valenciana
dc.contributor.funder European Commission

