dc.contributor.author | Baquero-Arnal, Pau | es_ES |
dc.contributor.author | Jorge-Cano, Javier | es_ES |
dc.contributor.author | Giménez Pastor, Adrián | es_ES |
dc.contributor.author | Iranzo-Sánchez, Javier | es_ES |
dc.contributor.author | Pérez-González de Martos, Alejandro Manuel | es_ES |
dc.contributor.author | Garcés Díaz-Munío, Gonçal | es_ES |
dc.contributor.author | Silvestre Cerdà, Joan Albert | es_ES |
dc.contributor.author | Civera Saiz, Jorge | es_ES |
dc.contributor.author | Sanchis Navarro, José Alberto | es_ES |
dc.contributor.author | Juan, Alfons | es_ES |
dc.date.accessioned | 2023-06-16T18:02:14Z | |
dc.date.available | 2023-06-16T18:02:14Z | |
dc.date.issued | 2022-01 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/194315 | |
dc.description.abstract | [EN] This paper describes the automatic speech recognition (ASR) systems built by the MLLP-VRAIN research group of Universitat Politècnica de València for the Albayzín-RTVE 2020 Speech-to-Text Challenge, and extends that work by building and evaluating equivalent systems under the closed data conditions of the 2018 challenge. The primary system (p-streaming_1500ms_nlt) was a hybrid ASR system using streaming one-pass decoding with a context window of 1.5 seconds. This system achieved 16.0% WER on the test-2020 set. We also submitted three contrastive systems. Among these, we highlight the system c2-streaming_600ms_t which, with a configuration similar to that of the primary system but a smaller context window of 0.6 s, scored 16.9% WER on the same test set, with a measured empirical latency of 0.81 ± 0.09 s (mean ± stdev). That is, we obtained state-of-the-art latencies for high-quality automatic live captioning with a small WER degradation of 6% relative. As an extension, the equivalent closed-condition systems obtained 23.3% WER and 23.5% WER, respectively. When evaluated with an unconstrained language model, we obtained 19.9% WER and 20.4% WER; i.e., not far behind the top-performing systems, using only 5% of the full acoustic data and with the added ability to operate in streaming mode. Indeed, all of these streaming systems could be put into production environments for automatic captioning of live media streams. | es_ES |
dc.description.sponsorship | The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreements no. 761758 (X5Gon) and 952215 (TAILOR), and the Erasmus+ Education programme under grant agreement no. 20-226-093604-SCH (EXPERT); the Government of Spain's grant RTI2018-094879-B-I00 (Multisub) funded by MCIN/AEI/10.13039/501100011033 & "ERDF A way of making Europe", and FPU scholarships FPU14/03981 and FPU18/04135; the Generalitat Valenciana's research project Classroom Activity Recognition (ref. PROMETEO/2019/111), and predoctoral research scholarship ACIF/2017/055; and the Universitat Politècnica de València's PAID-01-17 R&D support programme. | es_ES |
dc.language | English | es_ES |
dc.publisher | MDPI AG | es_ES |
dc.relation.ispartof | Applied Sciences | es_ES |
dc.rights | Attribution (by) | es_ES |
dc.subject | Natural language processing | es_ES |
dc.subject | Automatic speech recognition | es_ES |
dc.subject | Streaming | es_ES |
dc.subject.classification | LENGUAJES Y SISTEMAS INFORMATICOS | es_ES |
dc.title | MLLP-VRAIN Spanish ASR Systems for the Albayzín-RTVE 2020 Speech-to-Text Challenge: Extension | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.3390/app12020804 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-094879-B-I00/ES/SUBTITULACION MULTILINGUE DE CLASES DE AULA Y SESIONES PLENARIAS/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/Erasmus+/2020-1-SI01-KA226-SCH-093604/EU/Educational eXplanations and Practices in Emergency Remote Teaching | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/761758/EU/X5gon: Cross Modal, Cross Cultural, Cross Lingual, Cross Domain, and Cross Site Global OER Network/X5gon | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MECYD//AP2014%2F03981//AYUDA CONTRATO FPU 2014-JORGE CANO/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/952215/EU/Foundations of Trustworthy AI - Integrating Reasoning, Learning and Optimization/TAILOR | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MIU//FPU18%2F04135/ES/NOVEL CONTRIBUTIONS TO NEURAL SPEECH TRANSLATION/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GVA//PROMETEO%2F2019%2F111//CLASSROOM ACTIVITY RECOGNITION/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GVA//ACIF%2F2017%2F055/ES/Subvenciones para la contratación de personal investigador de carácter predoctoral | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/UPV/Programas de Apoyo a la I+D+i/PAID-01-17/ES/Ayudas para Contratos de Acceso de personal investigador doctor en estructuras de investigación de la Universitat Politècnica de València 2017- Subprograma 1/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escuela Politécnica Superior de Alcoy - Escola Politècnica Superior d'Alcoi | es_ES |
dc.description.bibliographicCitation | Baquero-Arnal, P.; Jorge-Cano, J.; Giménez Pastor, A.; Iranzo-Sánchez, J.; Pérez-González De Martos, AM.; Garcés Díaz-Munío, G.; Silvestre Cerdà, JA.... (2022). MLLP-VRAIN Spanish ASR Systems for the Albayzín-RTVE 2020 Speech-to-Text Challenge: Extension. Applied Sciences. 12(2):1-14. https://doi.org/10.3390/app12020804 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.3390/app12020804 | es_ES |
dc.description.upvformatpinicio | 1 | es_ES |
dc.description.upvformatpfin | 14 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 12 | es_ES |
dc.description.issue | 2 | es_ES |
dc.identifier.eissn | 2076-3417 | es_ES |
dc.relation.pasarela | S\453406 | es_ES |
dc.contributor.funder | Generalitat Valenciana | es_ES |
dc.contributor.funder | MINISTERIO DE EDUCACION | es_ES |
dc.contributor.funder | AGENCIA ESTATAL DE INVESTIGACION | es_ES |
dc.contributor.funder | European Regional Development Fund | es_ES |
dc.contributor.funder | European Commission | es_ES |
dc.contributor.funder | Universitat Politècnica de València | es_ES |
dc.contributor.funder | MINISTERIO DE CIENCIA INNOVACION Y UNIVERSIDADES | es_ES |
dc.subject.ods | 04.- Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all | es_ES |
dc.subject.ods | 09.- Build resilient infrastructure, promote inclusive and sustainable industrialization and foster innovation | es_ES |
dc.subject.ods | 10.- Reduce inequality within and among countries | es_ES |
upv.costeAPC | 1800 | es_ES |
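
The abstract reports a "WER degradation of 6% relative" between the primary system (16.0% WER) and the contrastive system c2-streaming_600ms_t (16.9% WER). The short Python sketch below shows one way that figure could be reproduced, assuming the usual definition of relative degradation as the WER difference divided by the baseline WER. The function name `relative_degradation` is hypothetical and is not taken from the paper; only the WER values come from the abstract.

```python
def relative_degradation(baseline_wer: float, contrastive_wer: float) -> float:
    """Relative WER increase of the contrastive system over the baseline, in percent.

    Assumed definition (not stated in the record itself):
        (contrastive - baseline) / baseline * 100
    """
    return (contrastive_wer - baseline_wer) / baseline_wer * 100.0


# WER values quoted in the abstract (test-2020 set).
primary_wer = 16.0      # p-streaming_1500ms_nlt, 1.5 s context window
contrastive_wer = 16.9  # c2-streaming_600ms_t, 0.6 s context window

open_condition = relative_degradation(primary_wer, contrastive_wer)
print(f"Relative WER degradation: {open_condition:.1f}%")
```

Under this assumed definition the result is roughly 5.6%, which is consistent with the 6% quoted in the abstract after rounding.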