dc.contributor.author | Iranzo-Sánchez, Javier | es_ES |
dc.contributor.author | Jorge-Cano, Javier | es_ES |
dc.contributor.author | Baquero-Arnal, Pau | es_ES |
dc.contributor.author | Silvestre Cerdà, Joan Albert | es_ES |
dc.contributor.author | Giménez Pastor, Adrián | es_ES |
dc.contributor.author | Civera Saiz, Jorge | es_ES |
dc.contributor.author | Sanchis Navarro, José Alberto | es_ES |
dc.contributor.author | Juan, Alfons | es_ES |
dc.date.accessioned | 2022-04-27T06:28:23Z | |
dc.date.available | 2022-04-27T06:28:23Z | |
dc.date.issued | 2021-05-31 | es_ES |
dc.identifier.issn | 0893-6080 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/182152 | |
dc.description.abstract | [EN] The cascade approach to Speech Translation (ST) is based on a pipeline that concatenates an Automatic Speech Recognition (ASR) system followed by a Machine Translation (MT) system. Nowadays, state-of-the-art ST systems are populated with deep neural networks that are conceived to work in an offline setup in which the audio input to be translated is fully available in advance. However, a streaming setup defines a completely different picture, in which an unbounded audio input gradually becomes available and at the same time the translation needs to be generated under real-time constraints. In this work, we present a state-of-the-art streaming ST system in which neural-based models integrated in the ASR and MT components are carefully adapted in terms of their training and decoding procedures in order to run under a streaming setup. In addition, a direct segmentation model that adapts the continuous ASR output to the capacity of simultaneous MT systems trained at the sentence level is introduced to guarantee low latency while preserving the translation quality of the complete ST system. The resulting ST system is thoroughly evaluated on the real-life streaming Europarl-ST benchmark to gauge the trade-off between quality and latency for each component individually as well as for the complete ST system. | es_ES |
dc.description.sponsorship | The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement no. 761758 (X5Gon) and 952215 (TAILOR); the Government of Spain's research project Multisub, ref. RTI2018-094879-B-I00 (MCIU/AEI/FEDER,EU) and FPU scholarships FPU14/03981 and FPU18/04135; and the Generalitat Valenciana's research project Classroom Activity Recognition, ref. PROMETEO/2019/111 and predoctoral research scholarship ACIF/2017/055. | es_ES |
dc.language | English | es_ES |
dc.publisher | Elsevier | es_ES |
dc.relation.ispartof | Neural Networks | es_ES |
dc.rights | Attribution - NonCommercial - NoDerivatives (by-nc-nd) | es_ES |
dc.subject | Streaming cascade speech translation | es_ES |
dc.subject | Segmentation model | es_ES |
dc.subject.classification | LIBRARY AND INFORMATION SCIENCE | es_ES |
dc.subject.classification | COMPUTER LANGUAGES AND SYSTEMS | es_ES |
dc.title | Streaming cascade-based speech translation leveraged by a direct segmentation model | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1016/j.neunet.2021.05.013 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-094879-B-I00/ES/SUBTITULACION MULTILINGUE DE CLASES DE AULA Y SESIONES PLENARIAS/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//ACIF%2F2017%2F055//AYUDA PREDOCTORAL CONSELLERIA-BAQUERO ARNAL/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/761758/EU | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/ //FPU18%2F04135//AYUDA PREDOCTORAL FPU-IRANZO SANCHEZ. PROYECTO: NOVEL CONTRIBUTIONS TO NEURAL SPEECH TRANSLATION/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/952215/EU | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//PROMETEO%2F2019%2F111//CLASSROOM ACTIVITY RECOGNITION/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/MECD//FPU14%2F03981/ES/FPU14%2F03981/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació | es_ES |
dc.description.bibliographicCitation | Iranzo-Sánchez, J.; Jorge-Cano, J.; Baquero-Arnal, P.; Silvestre Cerdà, JA.; Giménez Pastor, A.; Civera Saiz, J.; Sanchis Navarro, JA.... (2021). Streaming cascade-based speech translation leveraged by a direct segmentation model. Neural Networks. 142:303-315. https://doi.org/10.1016/j.neunet.2021.05.013 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1016/j.neunet.2021.05.013 | es_ES |
dc.description.upvformatpinicio | 303 | es_ES |
dc.description.upvformatpfin | 315 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 142 | es_ES |
dc.identifier.pmid | 34082286 | es_ES |
dc.relation.pasarela | S\438532 | es_ES |
dc.contributor.funder | GENERALITAT VALENCIANA | es_ES |
dc.contributor.funder | MINISTERIO DE EDUCACION | es_ES |
dc.contributor.funder | AGENCIA ESTATAL DE INVESTIGACION | es_ES |
dc.contributor.funder | European Regional Development Fund | es_ES |
dc.contributor.funder | COMISION DE LAS COMUNIDADES EUROPEA | es_ES |
dc.contributor.funder | MINISTERIO DE CIENCIA INNOVACION Y UNIVERSIDADES | es_ES |