Improved Hybrid Streaming ASR with Transformer Language Models

Baquero-Arnal, Pau; Jorge-Cano, Javier; Giménez Pastor, Adrián; Silvestre Cerdà, Joan Albert; Iranzo-Sánchez, Javier; Sanchis Navarro, José Alberto; Civera Saiz, Jorge; Juan, Alfons

doi:10.21437/Interspeech.2020

Files in this item

Name: Baquero-ArnalJorg ...

Size: 244.7Kb

Format: PDF

Description: Author's version.

Name: 2770.pdf

Size: 204.2Kb

Format: PDF

Description: Publisher's version

dc.contributor.author	Baquero-Arnal, Pau	es_ES
dc.contributor.author	Jorge-Cano, Javier	es_ES
dc.contributor.author	Giménez Pastor, Adrián	es_ES
dc.contributor.author	Silvestre Cerdà, Joan Albert	es_ES
dc.contributor.author	Iranzo-Sánchez, Javier	es_ES
dc.contributor.author	Sanchis Navarro, José Alberto	es_ES
dc.contributor.author	Civera Saiz, Jorge	es_ES
dc.contributor.author	Juan, Alfons	es_ES
dc.date.accessioned	2025-01-21T15:28:11Z
dc.date.available	2025-01-21T15:28:11Z
dc.date.issued	2020-10-29	es_ES
dc.identifier.issn	1990-9772	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/213952
dc.description.abstract	[EN] Streaming ASR is gaining momentum due to its wide applicability, though it is still unclear how best to come close to the accuracy of state-of-the-art off-line ASR systems when the output must come within a short delay after the incoming audio stream. Following our previous work on streaming one-pass decoding with hybrid ASR systems and LSTM language models, in this work we report further improvements by replacing LSTMs with Transformer models. First, two key ideas are discussed so as to run these models fast during inference. Then, empirical results on LibriSpeech and TED-LIUM are provided showing that Transformer language models lead to improved recognition rates on both tasks. ASR systems obtained in this work can be seamlessly transferred to a streaming setup with minimal quality losses. Indeed, to the best of our knowledge, no better results have been reported on these tasks when assessed under a streaming setup.	es_ES
dc.description.sponsorship	The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement no. 761758 (X5Gon); the Government of Spain's research project Multisub, ref. RTI2018-094879-B-I00 (MCIU/AEI/FEDER,EU); and the Generalitat Valenciana predoctoral research scholarship ACIF/2017/055	es_ES
dc.language	English	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	Streaming	es_ES
dc.subject	Hybrid ASR	es_ES
dc.subject	Language models	es_ES
dc.subject	Transformer	es_ES
dc.subject.classification	COMPUTER LANGUAGES AND SYSTEMS	es_ES
dc.title	Improved Hybrid Streaming ASR with Transformer Language Models	es_ES
dc.type	Conference paper	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.21437/Interspeech.2020	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/EC/H2020/761758/EU/X5gon: Cross Modal, Cross Cultural, Cross Lingual, Cross Domain, and Cross Site Global OER Network/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/GVA//ACIF%2F2017%2F055/ES/Subvenciones para la contratación de personal investigador de carácter predoctoral	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI//RTI2018-094879-B-I00-AR/ES/SUBTITULACIÓN MULTILINGÜE DE CLASES DE AULA Y SESIONES PLENARIAS/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Escuela Politécnica Superior de Alcoy - Escola Politècnica Superior d'Alcoi	es_ES
dc.description.bibliographicCitation	Baquero-Arnal, P.; Jorge-Cano, J.; Giménez Pastor, A.; Silvestre Cerdà, JA.; Iranzo-Sánchez, J.; Sanchis Navarro, JA.; Civera Saiz, J.... (2020). Improved Hybrid Streaming ASR with Transformer Language Models. 2127-2131. https://doi.org/10.21437/Interspeech.2020	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.conferencename	21st Annual Conference of the International Speech Communication Association (INTERSPEECH 2020)	es_ES
dc.relation.conferencedate	October 25-29, 2020	es_ES
dc.relation.conferenceplace	Online	es_ES
dc.relation.publisherversion	https://doi.org/10.21437/Interspeech.2020	es_ES
dc.description.upvformatpinicio	2127	es_ES
dc.description.upvformatpfin	2131	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.relation.pasarela	S\422454	es_ES
dc.contributor.funder	European Commission	es_ES
dc.contributor.funder	Generalitat Valenciana

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia
