A Spanish dataset for reproducible benchmarked offline handwriting recognition

RiuNet: Institutional Repository of the Universitat Politècnica de València


España Boquera, S.; Castro-Bleda, MJ. (2022). A Spanish dataset for reproducible benchmarked offline handwriting recognition. Language Resources and Evaluation. 56(3):1009-1022. https://doi.org/10.1007/s10579-022-09587-3

Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/198426

Item metadata

Title: A Spanish dataset for reproducible benchmarked offline handwriting recognition
Authors: España Boquera, Salvador; Castro-Bleda, Maria Jose
UPV entity: Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica
Release date:
Abstract:
In this paper, a public dataset for Offline Handwriting Recognition, along with an appropriate evaluation method to provide benchmark indicators at sentence level, is presented. This dataset, called SPA-Sentences, ...
Keywords: Handwriting recognition, Offline handwriting recognition, Datasets, Evaluation, Benchmarking, Experimental reproducibility, Spanish resources, Deep learning, Convolutional neural networks (CNN), Long short term memory (LSTM) networks, Connectionist temporal classification (CTC)
Usage rights: All rights reserved
Source: Language Resources and Evaluation (ISSN: 1574-020X)
DOI: 10.1007/s10579-022-09587-3
Publisher: Springer-Verlag
Publisher's version: https://doi.org/10.1007/s10579-022-09587-3
Type: Article

