Improved Hybrid Streaming ASR with Transformer Language Models

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

dc.contributor.author Baquero-Arnal, Pau es_ES
dc.contributor.author Jorge-Cano, Javier es_ES
dc.contributor.author Giménez Pastor, Adrián es_ES
dc.contributor.author Silvestre Cerdà, Joan Albert es_ES
dc.contributor.author Iranzo-Sánchez, Javier es_ES
dc.contributor.author Sanchis Navarro, José Alberto es_ES
dc.contributor.author Civera Saiz, Jorge es_ES
dc.contributor.author Juan, Alfons es_ES
dc.date.accessioned 2025-01-21T15:28:11Z
dc.date.available 2025-01-21T15:28:11Z
dc.date.issued 2020-10-29 es_ES
dc.identifier.issn 1990-9772 es_ES
dc.identifier.uri http://hdl.handle.net/10251/213952
dc.description.abstract [EN] Streaming ASR is gaining momentum due to its wide applicability, though it is still unclear how best to come close to the accuracy of state-of-the-art off-line ASR systems when the output must come within a short delay after the incoming audio stream. Following our previous work on streaming one-pass decoding with hybrid ASR systems and LSTM language models, in this work we report further improvements by replacing LSTMs with Transformer models. First, two key ideas are discussed so as to run these models fast during inference. Then, empirical results on LibriSpeech and TED-LIUM are provided showing that Transformer language models lead to improved recognition rates on both tasks. ASR systems obtained in this work can be seamlessly transferred to a streaming setup with minimal quality losses. Indeed, to the best of our knowledge, no better results have been reported on these tasks when assessed under a streaming setup. es_ES
dc.description.sponsorship The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement no. 761758 (X5Gon); the Government of Spain's research project Multisub, ref. RTI2018-094879-B-I00 (MCIU/AEI/FEDER,EU); and the Generalitat Valenciana predoctoral research scholarship ACIF/2017/055 es_ES
dc.language English es_ES
dc.rights All rights reserved es_ES
dc.subject Streaming es_ES
dc.subject Hybrid ASR es_ES
dc.subject Language models es_ES
dc.subject Transformer es_ES
dc.subject.classification LENGUAJES Y SISTEMAS INFORMATICOS es_ES
dc.title Improved Hybrid Streaming ASR with Transformer Language Models es_ES
dc.type Conference paper es_ES
dc.type Article es_ES
dc.identifier.doi 10.21437/Interspeech.2020 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/761758/EU/X5gon: Cross Modal, Cross Cultural, Cross Lingual, Cross Domain, and Cross Site Global OER Network/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GVA//ACIF%2F2017%2F055/ES/Subvenciones para la contratación de personal investigador de carácter predoctoral es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI//RTI2018-094879-B-I00-AR/ES/SUBTITULACIÓN MULTILINGÜE DE CLASES DE AULA Y SESIONES PLENARIAS/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escuela Politécnica Superior de Alcoy - Escola Politècnica Superior d'Alcoi es_ES
dc.description.bibliographicCitation Baquero-Arnal, P.; Jorge-Cano, J.; Giménez Pastor, A.; Silvestre Cerdà, JA.; Iranzo-Sánchez, J.; Sanchis Navarro, JA.; Civera Saiz, J.... (2020). Improved Hybrid Streaming ASR with Transformer Language Models. 2127-2131. https://doi.org/10.21437/Interspeech.2020 es_ES
dc.description.accrualMethod S es_ES
dc.relation.conferencename 21st Annual Conference of the International Speech Communication Association (INTERSPEECH 2020) es_ES
dc.relation.conferencedate October 25-29, 2020 es_ES
dc.relation.conferenceplace Online es_ES
dc.relation.publisherversion https://doi.org/10.21437/Interspeech.2020 es_ES
dc.description.upvformatpinicio 2127 es_ES
dc.description.upvformatpfin 2131 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.relation.pasarela S\422454 es_ES
dc.contributor.funder European Commission es_ES
dc.contributor.funder Generalitat Valenciana

