Jorge-Cano, Javier; Giménez Pastor, Adrián; Silvestre Cerdà, Joan Albert; Civera Saiz, Jorge; Sanchis Navarro, José Alberto; Juan, Alfons(Institute of Electrical and Electronics Engineers, 2022)
[EN] Although Long Short-Term Memory (LSTM) networks and deep Transformers are now extensively used in offline ASR, it is unclear how best offline systems can be adapted to work with them under the streaming setup. After ...
Jorge-Cano, Javier; Giménez Pastor, Adrián; Baquero-Arnal, Pau; Iranzo-Sánchez, Javier; Pérez-González de Martos, Alejandro Manuel; Garcés Díaz-Munío, Gonçal; Silvestre Cerdà, Joan Albert; Civera Saiz, Jorge; Sanchis Navarro, José Alberto; Juan, Alfons(2021-03-25)
[EN] This paper describes the automatic speech recognition (ASR) systems built by the MLLP-VRAIN research group of Universitat Politècnica de València for the Albayzín-RTVE 2020 Speech-to-Text Challenge. The primary system ...
[EN] This paper describes the automatic speech recognition (ASR) systems built by the MLLP-VRAIN research group of Universitat Politècnica de València for the Albayzín-RTVE 2020 Speech-to-Text Challenge, and includes an ...
Iranzo-Sánchez, Javier; Jorge-Cano, Javier; Pérez-González de Martos, Alejandro; Giménez, Adrián; Garcés Díaz-Munío, Gonçal; Baquero-Arnal, Pau; Silvestre Cerdà, Joan Albert; Civera Saiz, Jorge; Sanchis Navarro, José Alberto; Juan, Alfons(Association for Computational Linguistics (ACL), 2022-05-27)
[EN] This work describes the participation of the MLLP-VRAIN research group in the two shared tasks of the IWSLT 2022 conference: Simultaneous Speech Translation and Speech-to-Speech Translation. We present our streaming-ready ...
Santamaría Jordá, Jaume(Universitat Politècnica de València, 2022-09-12)
[EN] Acoustic modelling is a very active natural language processing task in artificial intelligence, particularly for automatic speech recognition. Recently, this task has received great attention ...
Iranzo Sánchez, Javier(Universitat Politècnica de València, 2019-10-30)
[EN] Machine translation (MT) is a research area concerned with developing systems that translate text automatically. Currently, most MT systems use neural networks in what is ...
Giménez Pastor, Adrián(Universitat Politècnica de València, 2011-10-18)
In this work we present a handwritten word recognition system based on segmented HMMs with Bernoulli mixtures in the states. This system is applied to the IAM corpus and compared with another, based ...
Del Agua Teba, Miguel Angel; Giménez Pastor, Adrián; Sanchis Navarro, José Alberto; Civera Saiz, Jorge; Juan, Alfons(Institute of Electrical and Electronics Engineers, 2018)
[EN] In recent years, Deep Bidirectional Recurrent Neural Networks (DBRNN) and DBRNN with Long Short-Term Memory cells (DBLSTM) have outperformed the most accurate classifiers for confidence estimation in automatic speech ...
Piqueras Gozalbes, Santiago Romualdo; Del Agua Teba, Miguel Angel; Giménez Pastor, Adrián; Civera Saiz, Jorge; Juan Císcar, Alfonso(Springer International Publishing, 2014)
Online multimedia repositories are growing rapidly. However, language barriers are often difficult to overcome for many of the current and potential users. In this paper we describe a TTS Spanish system and we apply it ...
[EN] The cascade approach to Speech Translation (ST) is based on a pipeline that concatenates an Automatic Speech Recognition (ASR) system followed by a Machine Translation (MT) system. Nowadays, state-of-the-art ST systems ...
Del-Agua, Miguel Ángel; Martínez-Villaronga, Adrià; Giménez Pastor, Adrián; Sanchis Navarro, José Alberto; Civera Saiz, Jorge; Juan, Alfons(CHiME, 2016-09-13)
[EN] The MLLP CHiME-4 system is presented in this paper. It has been built using the transLectures-UPV toolkit (TLK) developed by the MLLP research group which makes use of state-of-the-art automatic speech recognition ...
Del Agua Teba, Miguel Angel; Giménez Pastor, Adrián; Serrano Martinez Santos, Nicolas; Andrés Ferrer, Jesús; Civera Saiz, Jorge; Sanchis Navarro, José Alberto; Juan Císcar, Alfonso(Springer International Publishing, 2014)
Over the past few years, online multimedia educational repositories have increased in number and popularity. The main aim of the transLectures project is to develop cost-effective solutions for producing accurate transcriptions ...
Alkhoury, Ihab; Giménez Pastor, Adrián; Andrés Ferrer, Jesús; Juan Císcar, Alfonso; Sánchez Peiró, Joan Andreu(US National Institute of Standards and Technology (NIST), 2013-08-23)
The NIST Open Handwriting Recognition and Translation Evaluation 2013 (NIST OpenHaRT’13) is a performance evaluation assessing technologies that transcribe and translate text in document images. This evaluation is focused ...
Pérez-González de Martos, Alejandro Manuel; Garcés Díaz-Munío, Gonçal; Giménez Pastor, Adrián; Silvestre Cerdà, Joan Albert; Sanchis Navarro, José Alberto; Civera Saiz, Jorge; Jiménez, Manuel; Turró Ribalta, Carlos; Juan, Alfons(Elsevier, 2021-10)
[EN] The rapid progress of modern AI tools for automatic speech recognition and machine translation is leading to a progressive cost reduction to produce publishable subtitles for educational videos in multiple languages. ...
Silvestre Cerdà, Joan Albert; Del Agua Teba, Miguel Angel; Garcés Díaz-Munío, Gonzalo Vicente; Gascó Mora, Guillem; Giménez Pastor, Adrián; Martínez-Villaronga, Adrià Agustí; Pérez González de Martos, Alejandro Manuel; Sánchez-Cortina, Isaías; Serrano Martínez-Santos, Nicolás; Spencer, Rachel Nadine; Valor Miró, Juan Daniel; Andrés Ferrer, Jesús; Civera Saiz, Jorge; Sanchis Navarro, José Alberto; Juan Císcar, Alfonso(IberSPEECH 2012, 2012-11-21)
transLectures (Transcription and Translation of Video Lectures) is an EU STREP project in which advanced automatic speech recognition and machine translation techniques are being tested on large video lecture repositories. ...
[EN] Bernoulli HMMs are conventional HMMs in which the emission probabilities are modeled with Bernoulli mixtures. They have recently been applied, with good results, in off-line text recognition in many languages, in ...