Show simple item record
dc.contributor.author | Gimeno-Gómez, David | es_ES |
dc.contributor.author | Martínez-Hinarejos, Carlos-D. | es_ES |
dc.date.accessioned | 2024-11-14T19:13:46Z | |
dc.date.available | 2024-11-14T19:13:46Z | |
dc.date.issued | 2024-05-06 | es_ES |
dc.identifier.issn | 1687-4722 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/211808 | |
dc.description.abstract | [EN] Visual speech recognition (VSR) is a challenging task that has received increasing interest during the last few decades. The current state of the art employs powerful end-to-end architectures based on deep learning, which depend on large amounts of data and high computational resources for their estimation. We address the task of VSR in data-scarcity scenarios with limited computational resources by using traditional approaches based on hidden Markov models. We present a novel learning strategy that employs information obtained from previous acoustic temporal alignments to improve the performance of the visual system. Furthermore, we studied multiple visual speech representations and how image resolution or frame rate affects their performance. All these experiments were conducted on the limited-data VLRF corpus, a database that offers audio-visual support for addressing continuous speech recognition in Spanish. The results show that our approach significantly outperforms the best results achieved on the task to date. | es_ES |
dc.description.sponsorship | This work was partially supported by Grant CIACIF/2021/295 funded by Generalitat Valenciana and by Grant PID2021-124719OB-I00 under project LLEER (PID2021-124719OB-I00) funded by MCIN/AEI/10.13039/501100011033/ and by ERDF, EU "A way of making Europe". | es_ES |
dc.language | English | es_ES |
dc.publisher | Springer (BioMed Central Ltd.) | es_ES |
dc.relation.ispartof | EURASIP Journal on Audio, Speech and Music Processing | es_ES |
dc.rights | Attribution (by) | es_ES |
dc.subject | Visual speech recognition | es_ES |
dc.subject | Limited computation | es_ES |
dc.subject | Data scarcity | es_ES |
dc.subject | Speech processing | es_ES |
dc.subject | Computer vision | es_ES |
dc.subject.classification | LENGUAJES Y SISTEMAS INFORMATICOS | es_ES |
dc.title | Continuous lipreading based on acoustic temporal alignments | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1186/s13636-024-00345-7 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-124719OB-I00/ES/LECTURA DE LABIOS EN ESPAÑOL EN ESCENARIOS REALISTAS/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/GENERALITAT VALENCIANA//CIACIF%2F2021%2F295//Contributions to Automatic Lipreading for Spanish/ | es_ES |
dc.rights.accessRights | Open access | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica | es_ES |
dc.description.bibliographicCitation | Gimeno-Gómez, D.; Martínez-Hinarejos, C. (2024). Continuous lipreading based on acoustic temporal alignments. EURASIP Journal on Audio, Speech and Music Processing. 2024(1). https://doi.org/10.1186/s13636-024-00345-7 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1186/s13636-024-00345-7 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 2024 | es_ES |
dc.description.issue | 1 | es_ES |
dc.relation.pasarela | S\517636 | es_ES |
dc.contributor.funder | GENERALITAT VALENCIANA | es_ES |
dc.contributor.funder | AGENCIA ESTATAL DE INVESTIGACION | es_ES |
dc.contributor.funder | Universitat Politècnica de València | es_ES |
upv.costeAPC | 1900 | es_ES |
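
The abstract above describes reusing frame-level alignments from an acoustic model to supervise an HMM-based visual system. The sketch below is only an illustration of that general idea, not the authors' implementation: the function names, the 100 fps audio / 25 fps video frame rates, and the per-phone Gaussian initialization are all illustrative assumptions.

```python
# Illustrative sketch (not the paper's code): bootstrap per-phone Gaussian
# statistics for visual features from an acoustic forced alignment.
# Assumed frame rates: audio alignment at 100 frames/s, video features at 25 frames/s.
import numpy as np

def map_alignment_to_video(audio_labels, audio_fps=100, video_fps=25):
    """Downsample a per-frame phone alignment to the video frame rate."""
    step = audio_fps // video_fps                # e.g. 4 audio frames per video frame
    n_video = len(audio_labels) // step
    return [audio_labels[i * step] for i in range(n_video)]

def init_visual_models(video_feats, video_labels):
    """Estimate a diagonal Gaussian per phone from the aligned visual frames."""
    models = {}
    for phone in set(video_labels):
        idx = [i for i, p in enumerate(video_labels) if p == phone]
        frames = video_feats[idx]
        models[phone] = (frames.mean(axis=0), frames.var(axis=0) + 1e-6)
    return models

# Toy example: 400 audio frames of alignment, 100 video frames of 32-dim features.
audio_alignment = ["sil"] * 80 + ["o"] * 160 + ["l"] * 80 + ["a"] * 80
video_features = np.random.default_rng(0).normal(size=(100, 32))
labels = map_alignment_to_video(audio_alignment)
models = init_visual_models(video_features, labels)
print({phone: mean.shape for phone, (mean, var) in models.items()})
```

In a full system, statistics bootstrapped this way would seed the state distributions of the visual HMMs, which are then re-estimated on the visual data itself.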