MLLP-VRAIN Spanish ASR Systems for the Albayzín-RTVE 2020 Speech-to-Text Challenge: Extension

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

dc.contributor.author Baquero-Arnal, Pau es_ES
dc.contributor.author Jorge-Cano, Javier es_ES
dc.contributor.author Giménez Pastor, Adrián es_ES
dc.contributor.author Iranzo-Sánchez, Javier es_ES
dc.contributor.author Pérez-González de Martos, Alejandro Manuel es_ES
dc.contributor.author Garcés Díaz-Munío, Gonçal es_ES
dc.contributor.author Silvestre Cerdà, Joan Albert es_ES
dc.contributor.author Civera Saiz, Jorge es_ES
dc.contributor.author Sanchis Navarro, José Alberto es_ES
dc.contributor.author Juan, Alfons es_ES
dc.date.accessioned 2023-06-16T18:02:14Z
dc.date.available 2023-06-16T18:02:14Z
dc.date.issued 2022-01 es_ES
dc.identifier.uri http://hdl.handle.net/10251/194315
dc.description.abstract [EN] This paper describes the automatic speech recognition (ASR) systems built by the MLLP-VRAIN research group of Universitat Politècnica de València for the Albayzín-RTVE 2020 Speech-to-Text Challenge, and includes an extension of the work that builds and evaluates equivalent systems under the closed data conditions of the 2018 challenge. The primary system (p-streaming_1500ms_nlt) was a hybrid ASR system using streaming one-pass decoding with a context window of 1.5 seconds. This system achieved 16.0% WER on the test-2020 set. We also submitted three contrastive systems. Of these, we highlight the system c2-streaming_600ms_t, which follows a configuration similar to that of the primary system but with a smaller context window of 0.6 s; it scored 16.9% WER on the same test set, with a measured empirical latency of 0.81 ± 0.09 s (mean ± stdev). That is, we obtained state-of-the-art latencies for high-quality automatic live captioning with a small relative WER degradation of 6%. As an extension, the equivalent closed-condition systems obtained 23.3% WER and 23.5% WER, respectively. When evaluated with an unconstrained language model, they obtained 19.9% WER and 20.4% WER; i.e., not far behind the top-performing systems with only 5% of the full acoustic data and with the extra ability of being streaming-capable. Indeed, all of these streaming systems could be put into production environments for automatic captioning of live media streams. es_ES
dc.description.sponsorship The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements no. 761758 (X5Gon) and 952215 (TAILOR), and Erasmus+ Education programme under grant agreement no. 20-226-093604-SCH (EXPERT); the Government of Spain's grant RTI2018-094879-B-I00 (Multisub) funded by MCIN/AEI/10.13039/501100011033 & "ERDF A way of making Europe", and FPU scholarships FPU14/03981 and FPU18/04135; the Generalitat Valenciana's research project Classroom Activity Recognition (ref. PROMETEO/2019/111), and predoctoral research scholarship ACIF/2017/055; and the Universitat Politècnica de València's PAID-01-17 R&D support programme. es_ES
dc.language English es_ES
dc.publisher MDPI AG es_ES
dc.relation.ispartof Applied Sciences es_ES
dc.rights Attribution (by) es_ES
dc.subject Natural language processing es_ES
dc.subject Automatic speech recognition es_ES
dc.subject Streaming es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title MLLP-VRAIN Spanish ASR Systems for the Albayzín-RTVE 2020 Speech-to-Text Challenge: Extension es_ES
dc.type Article es_ES
dc.identifier.doi 10.3390/app12020804 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-094879-B-I00/ES/SUBTITULACION MULTILINGUE DE CLASES DE AULA Y SESIONES PLENARIAS/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/COMISION DE LAS COMUNIDADES EUROPEA//2020-1-SI01-KA226-SCH-093604//EDUCATIONAL EXPLANATIONS AND PRACTICES IN EMERGENCY REMOTE TEACHING/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/761758/EU es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MECYD//AP2014%2F03981//AYUDA CONTRATO FPU 2014-JORGE CANO/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/952215/EU es_ES
dc.relation.projectID info:eu-repo/grantAgreement/ //FPU18%2F04135//AYUDA PREDOCTORAL FPU-IRANZO SANCHEZ. PROYECTO: NOVEL CONTRIBUTIONS TO NEURAL SPEECH TRANSLATION/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//PROMETEO%2F2019%2F111//CLASSROOM ACTIVITY RECOGNITION/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//ACIF%2F2017%2F055//AYUDA PREDOCTORAL CONSELLERIA-BAQUERO ARNAL/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UPV//PAID-01-17//Contratos Pre-Doctorales UPV 2017- Subprograma 1/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escuela Politécnica Superior de Alcoy - Escola Politècnica Superior d'Alcoi es_ES
dc.description.bibliographicCitation Baquero-Arnal, P.; Jorge-Cano, J.; Giménez Pastor, A.; Iranzo-Sánchez, J.; Pérez-González De Martos, AM.; Garcés Díaz-Munío, G.; Silvestre Cerdà, JA.... (2022). MLLP-VRAIN Spanish ASR Systems for the Albayzín-RTVE 2020 Speech-to-Text Challenge: Extension. Applied Sciences. 12(2):1-14. https://doi.org/10.3390/app12020804 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.3390/app12020804 es_ES
dc.description.upvformatpinicio 1 es_ES
dc.description.upvformatpfin 14 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 12 es_ES
dc.description.issue 2 es_ES
dc.identifier.eissn 2076-3417 es_ES
dc.relation.pasarela S\453406 es_ES
dc.contributor.funder GENERALITAT VALENCIANA es_ES
dc.contributor.funder MINISTERIO DE EDUCACION es_ES
dc.contributor.funder AGENCIA ESTATAL DE INVESTIGACION es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder COMISION DE LAS COMUNIDADES EUROPEA es_ES
dc.contributor.funder Universitat Politècnica de València es_ES
dc.contributor.funder MINISTERIO DE CIENCIA INNOVACION Y UNIVERSIDADES es_ES
dc.subject.ods 04.- Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all es_ES
dc.subject.ods 09.- Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation es_ES
dc.subject.ods 10.- Reduce inequality within and among countries es_ES
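
The abstract reports a relative WER degradation of 6% between the primary system (16.0% WER) and the c2-streaming_600ms_t contrastive system (16.9% WER). As a minimal sketch of how such a figure is usually derived, the snippet below computes the relative increase from the two WER values quoted in the abstract. The function name `relative_degradation` is a hypothetical helper chosen for illustration; it is not taken from the paper or from any toolkit the authors used.

```python
def relative_degradation(baseline_wer: float, new_wer: float) -> float:
    """Relative WER increase of new_wer over baseline_wer, as a percentage.

    Hypothetical helper for illustration only.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (new_wer - baseline_wer) / baseline_wer


# WER figures quoted in the abstract (test-2020 set).
primary_wer = 16.0      # p-streaming_1500ms_nlt
contrastive_wer = 16.9  # c2-streaming_600ms_t

rel = relative_degradation(primary_wer, contrastive_wer)
# Roughly 5.6%, which matches the abstract's "6% relative" after rounding.
print(f"{rel:.1f}% relative WER degradation")
```

The same helper can be applied to the other WER pairs in the abstract (for example 23.3% and 23.5% for the closed-condition systems) to compare the cost of the smaller context window across conditions.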

