Streaming cascade-based speech translation leveraged by a direct segmentation model

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia

dc.contributor.author Iranzo-Sánchez, Javier es_ES
dc.contributor.author Jorge-Cano, Javier es_ES
dc.contributor.author Baquero-Arnal, Pau es_ES
dc.contributor.author Silvestre Cerdà, Joan Albert es_ES
dc.contributor.author Giménez Pastor, Adrián es_ES
dc.contributor.author Civera Saiz, Jorge es_ES
dc.contributor.author Sanchis Navarro, José Alberto es_ES
dc.contributor.author Juan, Alfons es_ES
dc.date.accessioned 2022-04-27T06:28:23Z
dc.date.available 2022-04-27T06:28:23Z
dc.date.issued 2021-05-31 es_ES
dc.identifier.issn 0893-6080 es_ES
dc.identifier.uri http://hdl.handle.net/10251/182152
dc.description.abstract [EN] The cascade approach to Speech Translation (ST) is based on a pipeline that concatenates an Automatic Speech Recognition (ASR) system followed by a Machine Translation (MT) system. Nowadays, state-of-the-art ST systems are populated with deep neural networks that are conceived to work in an offline setup in which the audio input to be translated is fully available in advance. However, a streaming setup defines a completely different picture, in which an unbounded audio input gradually becomes available and at the same time the translation needs to be generated under real-time constraints. In this work, we present a state-of-the-art streaming ST system in which neural-based models integrated in the ASR and MT components are carefully adapted in terms of their training and decoding procedures in order to run under a streaming setup. In addition, a direct segmentation model that adapts the continuous ASR output to the capacity of simultaneous MT systems trained at the sentence level is introduced to guarantee low latency while preserving the translation quality of the complete ST system. The resulting ST system is thoroughly evaluated on the real-life streaming Europarl-ST benchmark to gauge the trade-off between quality and latency for each component individually as well as for the complete ST system. es_ES
dc.description.sponsorship The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement no. 761758 (X5Gon) and 952215 (TAILOR); the Government of Spain's research project Multisub, ref. RTI2018-094879-B-I00 (MCIU/AEI/FEDER,EU) and FPU scholarships FPU14/03981 and FPU18/04135; and the Generalitat Valenciana's research project Classroom Activity Recognition, ref. PROMETEO/2019/111 and predoctoral research scholarship ACIF/2017/055. es_ES
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Neural Networks es_ES
dc.rights Attribution - NonCommercial - NoDerivatives (by-nc-nd) es_ES
dc.subject Streaming cascade speech translation es_ES
dc.subject Segmentation model es_ES
dc.subject.classification LIBRARY AND INFORMATION SCIENCE es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Streaming cascade-based speech translation leveraged by a direct segmentation model es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.neunet.2021.05.013 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-094879-B-I00/ES/SUBTITULACION MULTILINGUE DE CLASES DE AULA Y SESIONES PLENARIAS/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//ACIF%2F2017%2F055//AYUDA PREDOCTORAL CONSELLERIA-BAQUERO ARNAL/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/761758/EU es_ES
dc.relation.projectID info:eu-repo/grantAgreement/ //FPU18%2F04135//AYUDA PREDOCTORAL FPU-IRANZO SANCHEZ. PROYECTO: NOVEL CONTRIBUTIONS TO NEURAL SPEECH TRANSLATION/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/952215/EU es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//PROMETEO%2F2019%2F111//CLASSROOM ACTIVITY RECOGNITION/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MECD//FPU14%2F03981/ES/FPU14%2F03981/ es_ES
dc.rights.accessRights Open access es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Iranzo-Sánchez, J.; Jorge-Cano, J.; Baquero-Arnal, P.; Silvestre Cerdà, JA.; Giménez Pastor, A.; Civera Saiz, J.; Sanchis Navarro, JA.... (2021). Streaming cascade-based speech translation leveraged by a direct segmentation model. Neural Networks. 142:303-315. https://doi.org/10.1016/j.neunet.2021.05.013 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1016/j.neunet.2021.05.013 es_ES
dc.description.upvformatpinicio 303 es_ES
dc.description.upvformatpfin 315 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 142 es_ES
dc.identifier.pmid 34082286 es_ES
dc.relation.pasarela S\438532 es_ES
dc.contributor.funder GENERALITAT VALENCIANA es_ES
dc.contributor.funder MINISTERIO DE EDUCACION es_ES
dc.contributor.funder AGENCIA ESTATAL DE INVESTIGACION es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder COMISION DE LAS COMUNIDADES EUROPEA es_ES
dc.contributor.funder MINISTERIO DE CIENCIA INNOVACION Y UNIVERSIDADES es_ES

