Comparing Speaker Adaptation Methods for Visual Speech Recognition for Continuous Spanish

RiuNet: Institutional Repository of the Universitat Politècnica de València

dc.contributor.author Gimeno-Gómez, David es_ES
dc.contributor.author Martínez-Hinarejos, Carlos-D. es_ES
dc.date.accessioned 2024-05-23T18:05:55Z
dc.date.available 2024-05-23T18:05:55Z
dc.date.issued 2023-05-26 es_ES
dc.identifier.uri http://hdl.handle.net/10251/204394
dc.description.abstract [EN] Visual speech recognition (VSR) is a challenging task that aims to interpret speech based solely on lip movements. However, although remarkable results have recently been reached in the field, this task remains an open research problem due to different challenges, such as visual ambiguities, the intra-personal variability among speakers, and the complex modeling of silence. Nonetheless, these challenges can be alleviated when the task is approached from a speaker-dependent perspective. Our work focuses on the adaptation of end-to-end VSR systems to a specific speaker. Hence, we propose two different adaptation methods based on the conventional fine-tuning technique or the so-called Adapters. We conduct a comparative study in terms of performance while considering different deployment aspects such as training time and storage cost. Results on the Spanish LIP-RTVE database show that both methods are able to obtain recognition rates comparable to the state of the art, even when only a limited amount of training data is available. Although it incurs a deterioration in performance, the Adapters-based method presents a more scalable and efficient solution, significantly reducing the training time and storage cost by up to 80%. es_ES
dc.description.sponsorship This work was partially supported by the Grant CIACIF/2021/295 funded by Generalitat Valenciana and by the Grant PID2021-124719OB-I00 under the LLEER (PID2021-124719OB-I00) project funded by MCIN/AEI/10.13039/501100011033/ and by ERDF EU, A way of making Europe. es_ES
dc.language English es_ES
dc.publisher MDPI AG es_ES
dc.relation.ispartof Applied Sciences es_ES
dc.rights Attribution (by) es_ES
dc.subject Visual speech recognition es_ES
dc.subject Speaker adaptation es_ES
dc.subject Fine-tuning es_ES
dc.subject Adapters es_ES
dc.subject Spanish language es_ES
dc.subject End-to-end architectures es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Comparing Speaker Adaptation Methods for Visual Speech Recognition for Continuous Spanish es_ES
dc.type Article es_ES
dc.identifier.doi 10.3390/app13116521 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-124719OB-I00/ES/LECTURA DE LABIOS EN ESPAÑOL EN ESCENARIOS REALISTAS/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//CIACIF%2F2021%2F295//Contributions to Automatic Lipreading for Spanish/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/FEDER//C22%2FERDF/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.description.bibliographicCitation Gimeno-Gómez, D.; Martínez-Hinarejos, C. (2023). Comparing Speaker Adaptation Methods for Visual Speech Recognition for Continuous Spanish. Applied Sciences. 13(11). https://doi.org/10.3390/app13116521 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.3390/app13116521 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 13 es_ES
dc.description.issue 11 es_ES
dc.identifier.eissn 2076-3417 es_ES
dc.relation.pasarela S\494441 es_ES
dc.contributor.funder GENERALITAT VALENCIANA es_ES
dc.contributor.funder AGENCIA ESTATAL DE INVESTIGACION es_ES
dc.contributor.funder European Regional Development Fund es_ES

