
Speaker-Adapted Confidence Measures for ASR using Deep Bidirectional Recurrent Neural Networks

RiuNet: Institutional repository of the Polytechnic University of Valencia



dc.contributor.author Del Agua Teba, Miguel Angel es_ES
dc.contributor.author Giménez Pastor, Adrián es_ES
dc.contributor.author Sanchis Navarro, José Alberto es_ES
dc.contributor.author Civera Saiz, Jorge es_ES
dc.contributor.author Juan, Alfons es_ES
dc.date.accessioned 2019-05-31T20:43:40Z
dc.date.available 2019-05-31T20:43:40Z
dc.date.issued 2018 es_ES
dc.identifier.issn 2329-9290 es_ES
dc.identifier.uri http://hdl.handle.net/10251/121369
dc.description © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
dc.description.abstract [EN] In recent years, Deep Bidirectional Recurrent Neural Networks (DBRNN) and DBRNN with Long Short-Term Memory cells (DBLSTM) have outperformed the most accurate classifiers for confidence estimation in automatic speech recognition. At the same time, we have recently shown that speaker adaptation of confidence measures using DBLSTM yields significant improvements over non-adapted confidence measures. In accordance with these two recent contributions to the state of the art in confidence estimation, this paper presents a comprehensive study of speaker-adapted confidence measures using DBRNN and DBLSTM models. Firstly, we present new empirical evidence of the superiority of RNN-based confidence classifiers evaluated over a large speech corpus consisting of the English LibriSpeech and the Spanish poliMedia tasks. Secondly, we show new results on speaker-adapted confidence measures considering a multi-task framework in which RNN-based confidence classifiers trained with LibriSpeech are adapted to speakers of the TED-LIUM corpus. These experiments confirm that speaker-adapted confidence measures outperform their non-adapted counterparts. Lastly, we describe an unsupervised method for adapting the acoustic DBLSTM model based on confidence measures, which results in better automatic speech recognition performance. es_ES
dc.description.sponsorship This work was supported in part by the European Union's Horizon 2020 research and innovation programme under Grant 761758 (X5gon), in part by the Seventh Framework Programme (FP7/2007-2013) under Grant 287755 (transLectures), in part by the ICT Policy Support Programme (ICT PSP/2007-2013) as part of the Competitiveness and Innovation Framework Programme under Grant 621030 (EMMA), and in part by the Spanish Government's TIN2015-68326-R (MINECO/FEDER) research project MORE. es_ES
dc.language English es_ES
dc.publisher Institute of Electrical and Electronics Engineers es_ES
dc.relation MINECO-FEDER/TIN2015-68326-R es_ES
dc.relation.ispartof IEEE/ACM Transactions on Audio, Speech, and Language Processing es_ES
dc.rights All rights reserved es_ES
dc.subject Automatic speech recognition es_ES
dc.subject Confidence estimation es_ES
dc.subject Confidence measures es_ES
dc.subject Deep bidirectional recurrent neural networks es_ES
dc.subject Long short-term memory es_ES
dc.subject Speaker adaptation es_ES
dc.subject Speech es_ES
dc.subject Adaptation models es_ES
dc.subject Computer architecture es_ES
dc.subject Training es_ES
dc.subject Recurrent neural networks es_ES
dc.subject Speech processing es_ES
dc.subject Task analysis es_ES
dc.subject.classification LIBRARY AND INFORMATION SCIENCE es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Speaker-Adapted Confidence Measures for ASR using Deep Bidirectional Recurrent Neural Networks es_ES
dc.type Article es_ES
dc.identifier.doi 10.1109/TASLP.2018.2819900 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/287755/EU/Transcription and Translation of Video Lectures/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/761758/EU/X5gon: Cross Modal, Cross Cultural, Cross Lingual, Cross Domain, and Cross Site Global OER Network/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Del Agua Teba, MA.; Giménez Pastor, A.; Sanchis Navarro, JA.; Civera Saiz, J.; Juan, A. (2018). Speaker-Adapted Confidence Measures for ASR using Deep Bidirectional Recurrent Neural Networks. IEEE/ACM Transactions on Audio Speech and Language Processing. 26(7):1198-1206. https://doi.org/10.1109/TASLP.2018.2819900 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion http://doi.org/10.1109/TASLP.2018.2819900 es_ES
dc.description.upvformatpinicio 1198 es_ES
dc.description.upvformatpfin 1206 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 26 es_ES
dc.description.issue 7 es_ES
dc.relation.pasarela S\356121 es_ES
dc.contributor.funder Ministerio de Economía y Empresa es_ES
dc.contributor.funder European Commission es_ES

