Pérez-González De Martos, AM.; Garcés Díaz-Munío, G.; Giménez Pastor, A.; Silvestre Cerdà, JA.; Sanchis Navarro, JA.; Civera Saiz, J.; Jiménez, M.... (2021). Towards cross-lingual voice cloning in higher education. Engineering Applications of Artificial Intelligence. 105:1-9. https://doi.org/10.1016/j.engappai.2021.104413
Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/183624
Title:
|
Towards cross-lingual voice cloning in higher education
|
Author:
|
Pérez-González de Martos, Alejandro Manuel
Garcés Díaz-Munío, Gonçal
Giménez Pastor, Adrián
Silvestre Cerdà, Joan Albert
Sanchis Navarro, José Alberto
Civera Saiz, Jorge
Jiménez, Manuel
Turró Ribalta, Carlos
Juan, Alfons
|
UPV entity:
|
Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació
|
Release date:
|
|
Abstract:
|
[EN] The rapid progress of modern AI tools for automatic speech recognition and machine translation is leading to a progressive cost reduction to produce publishable subtitles for educational videos in multiple languages. Similarly, text-to-speech technology is experiencing large improvements in terms of quality, flexibility and capabilities. In particular, state-of-the-art systems are now capable of seamlessly dealing with multiple languages and speakers in an integrated manner, thus enabling lecturer's voice cloning in languages she/he might not even speak. This work is to report the experience gained on using such systems at the Universitat Politècnica de València (UPV), mainly as a guidance for other educational organizations willing to conduct similar studies. It builds on previous work on the UPV's main repository of educational videos, MediaUPV, to produce multilingual subtitles at scale and low cost. Here, a detailed account is given on how this work has been extended to also allow for massive machine dubbing of MediaUPV. This includes collecting 59 h of clean speech data from UPV's academic staff, and extending our production pipeline of subtitles with a state-of-the-art multilingual and multi-speaker text-to-speech system trained from the collected data. Our main result comes from an extensive, subjective evaluation of this system by lecturers contributing to data collection. In brief, it is shown that text-to-speech technology is not only mature enough for its application to MediaUPV, but also needed as soon as possible by students to improve its accessibility and bridge language barriers.
|
Keywords:
|
Text-to-speech, Multilinguality, Cross-lingual voice conversion, Educational resources, OER
|
Usage rights:
|
Attribution (by)
|
Source:
|
Engineering Applications of Artificial Intelligence (ISSN: 0952-1976)
|
DOI:
|
10.1016/j.engappai.2021.104413
|
Publisher:
|
Elsevier
|
Publisher's version:
|
https://doi.org/10.1016/j.engappai.2021.104413
|
APC cost:
|
2773.8 €
|
Project code:
|
info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-094879-B-I00/ES/SUBTITULACION MULTILINGUE DE CLASES DE AULA Y SESIONES PLENARIAS/
info:eu-repo/grantAgreement/UPV//PAID-01-17//Contratos Pre-Doctorales UPV 2017- Subprograma 1/
info:eu-repo/grantAgreement/EC/H2020/761758/EU
|
Acknowledgements:
|
We wish first to thank all UPV lecturers who made this study possible. We are also very grateful for the funding support received from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 761758 (X5gon), the Spanish government under grant RTI2018-094879-B-I00 (Multisub, MCIU/AEI/FEDER), and the Universitat Politècnica de València (Spain) PAID-01-17 R&D support programme. Funding for open access charge: CRUE-Universitat Politècnica de València.
|
Type:
|
Article
|