Development and Evaluation of an Automatic Speech Recognition System Adapted to the Transcription of Classroom Video Recordings

Roselló Beneitez, Nahuel Unai

Files in this item

Name: Rosello - Desarrollo ...

Size: 1.026Mb

Format: PDF

dc.contributor.advisor	Sanchis Navarro, José Alberto	es_ES
dc.contributor.advisor	Giménez Pastor, Adrián	es_ES
dc.contributor.author	Roselló Beneitez, Nahuel Unai	es_ES
dc.date.accessioned	2021-10-14T08:06:53Z
dc.date.available	2021-10-14T08:06:53Z
dc.date.created	2021-09-22
dc.date.issued	2021-10-14	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/174670
dc.description.abstract	[ES] El Reconocimiento Automático del Habla (RAH) ha demostrado ser una manera efectiva y eficiente de convertir habla a texto a lo largo de los últimos años. Este trabajo, desarrollado en el contexto de dos proyectos apoyados por el Gobierno de España y la Generalitat Valenciana, explora el uso del RAH en el contexto de grabaciones de clases de aula. Con este fin, se explota un conjunto de datos con más de 1400 horas de grabaciones de clases. Este conjunto se compone de dos fuentes de datos (micrófonos de solapa y cámara) que graban una clase determinada al mismo tiempo, aunque una de las fuentes tiene peor calidad que la otra. A lo largo de esta memoria, se describen algunos de los problemas que se han dado en los proyectos, como el hecho de que inicialmente el conjunto de datos no viene dado con ninguna transcripción, o que ambas fuentes de datos no estaban perfectamente sincronizadas en algunos casos. Este trabajo también presenta experimentos llevados a cabo con la fuente de datos de mejor calidad, y replicados con ambas fuentes de audio con el fin de comparar las dos aproximaciones. Además, se reentrena un sistema ya existente con ambas fuentes de audio. El sistema resultante, previamente entrenado con casi 4000 horas de audio, se compara con el resto de sistemas desarrollados. Finalmente, este trabajo expone algunas conclusiones extraídas de los experimentos anteriormente mencionados.	es_ES
dc.description.abstract	[EN] Automatic Speech Recognition (ASR) has proven to be an efficient and effective way of converting speech to text in recent years. This work, carried out in the context of two projects supported by the Government of Spain and the Generalitat Valenciana, explores the use of ASR for classroom video recordings. To this end, a dataset of more than 1400 hours of classroom recordings is exploited. The dataset comprises two audio sources (clip-on and camera microphones) that record a given class simultaneously, although one of them is noisier than the other. Several obstacles encountered during the work are described, such as the fact that the dataset initially included no transcriptions of the recordings, or that the two audio sources were not perfectly synchronized in some recordings. This work also presents experiments performed with the cleaner audio source and replicated with both audio sources in order to compare the two approaches. Moreover, a baseline system trained with nearly 4000 hours of audio is retrained with both audio sources, and the resulting system is compared with the other systems developed. Finally, this work closes with some conclusions drawn from these experiments.	es_ES
dc.format.extent	76	es_ES
dc.language	English	es_ES
dc.publisher	Universitat Politècnica de València	es_ES
dc.rights	Attribution (by)	es_ES
dc.subject	Reconocimiento automático del habla	es_ES
dc.subject	Redes neuronales	es_ES
dc.subject	Grabaciones de clases de aula	es_ES
dc.subject	Adaptación de sistemas	es_ES
dc.subject	Automatic speech recognition	es_ES
dc.subject	Neural networks	es_ES
dc.subject	Classroom video recordings	es_ES
dc.subject	System adaptation	es_ES
dc.subject.classification	LENGUAJES Y SISTEMAS INFORMATICOS	es_ES
dc.subject.other	Máster Universitario en Inteligencia Artificial, Reconocimiento de Formas e Imagen Digital-Màster Universitari en Intel·Ligència Artificial: Reconeixement de Formes i Imatge Digital	es_ES
dc.title	Development and Evaluation of an Automatic Speech Recognition System Adapted to the Transcription of Classroom Video Recordings	es_ES
dc.type	Master's thesis	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/Generalitat Valenciana//PROMETEO%2F2019%2F111/ES/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-094879-B-I00/ES/SUBTITULACION MULTILINGUE DE CLASES DE AULA Y SESIONES PLENARIAS/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.description.bibliographicCitation	Roselló Beneitez, NU. (2021). Development and Evaluation of an Automatic Speech Recognition System Adapted to the Transcription of Classroom Video Recordings. Universitat Politècnica de València. http://hdl.handle.net/10251/174670	es_ES
dc.description.accrualMethod	TFGM	es_ES
dc.relation.pasarela	TFGM\141786	es_ES
dc.contributor.funder	Universitat Politècnica de València	es_ES
dc.contributor.funder	Generalitat Valenciana	es_ES

This item appears in the following collection(s)

Servicio de alumnado - Trabajos académicos [7391]