
Towards cross-lingual voice cloning in higher education

RiuNet: Institutional Repository of the Universitat Politècnica de València


dc.contributor.author Pérez-González de Martos, Alejandro Manuel es_ES
dc.contributor.author Garcés Díaz-Munío, Gonçal es_ES
dc.contributor.author Giménez Pastor, Adrián es_ES
dc.contributor.author Silvestre Cerdà, Joan Albert es_ES
dc.contributor.author Sanchis Navarro, José Alberto es_ES
dc.contributor.author Civera Saiz, Jorge es_ES
dc.contributor.author Jiménez, Manuel es_ES
dc.contributor.author Turró Ribalta, Carlos es_ES
dc.contributor.author Juan, Alfons es_ES
dc.date.accessioned 2022-06-27T18:06:49Z
dc.date.available 2022-06-27T18:06:49Z
dc.date.issued 2021-10 es_ES
dc.identifier.issn 0952-1976 es_ES
dc.identifier.uri http://hdl.handle.net/10251/183624
dc.description.abstract [EN] The rapid progress of modern AI tools for automatic speech recognition and machine translation is leading to a progressive cost reduction to produce publishable subtitles for educational videos in multiple languages. Similarly, text-to-speech technology is experiencing large improvements in terms of quality, flexibility and capabilities. In particular, state-of-the-art systems are now capable of seamlessly dealing with multiple languages and speakers in an integrated manner, thus enabling lecturer's voice cloning in languages she/he might not even speak. This work is to report the experience gained on using such systems at the Universitat Politècnica de València (UPV), mainly as a guidance for other educational organizations willing to conduct similar studies. It builds on previous work on the UPV's main repository of educational videos, MediaUPV, to produce multilingual subtitles at scale and low cost. Here, a detailed account is given on how this work has been extended to also allow for massive machine dubbing of MediaUPV. This includes collecting 59 h of clean speech data from UPV's academic staff, and extending our production pipeline of subtitles with a state-of-the-art multilingual and multi-speaker text-to-speech system trained from the collected data. Our main result comes from an extensive, subjective evaluation of this system by lecturers contributing to data collection. In brief, it is shown that text-to-speech technology is not only mature enough for its application to MediaUPV, but also needed as soon as possible by students to improve its accessibility and bridge language barriers. es_ES
dc.description.sponsorship We wish first to thank all UPV lecturers who made this study possible. We are also very grateful for the funding support received by the European Union's Horizon 2020 research and innovation programme under grant agreement no. 761758 (X5gon), the Spanish government under grant RTI2018-094879-B-I00 (Multisub, MCIU/AEI/FEDER), and the Universitat Politècnica de València (Spain) PAID-01-17 R&D support programme. Funding for open access charge: CRUE-Universitat Politècnica de València es_ES
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Engineering Applications of Artificial Intelligence es_ES
dc.rights Attribution (by) es_ES
dc.subject Text-to-speech es_ES
dc.subject Multilinguality es_ES
dc.subject Cross-lingual voice conversion es_ES
dc.subject Educational resources es_ES
dc.subject OER es_ES
dc.subject.classification LIBRARY AND INFORMATION SCIENCE es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Towards cross-lingual voice cloning in higher education es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.engappai.2021.104413 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/RTI2018-094879-B-I00/ES/SUBTITULACION MULTILINGUE DE CLASES DE AULA Y SESIONES PLENARIAS/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UPV//PAID-01-17//Contratos Pre-Doctorales UPV 2017- Subprograma 1/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/761758/EU es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Pérez-González De Martos, AM.; Garcés Díaz-Munío, G.; Giménez Pastor, A.; Silvestre Cerdà, JA.; Sanchis Navarro, JA.; Civera Saiz, J.; Jiménez, M.... (2021). Towards cross-lingual voice cloning in higher education. Engineering Applications of Artificial Intelligence. 105:1-9. https://doi.org/10.1016/j.engappai.2021.104413 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1016/j.engappai.2021.104413 es_ES
dc.description.upvformatpinicio 1 es_ES
dc.description.upvformatpfin 9 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 105 es_ES
dc.relation.pasarela S\444343 es_ES
dc.contributor.funder AGENCIA ESTATAL DE INVESTIGACION es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder COMISION DE LAS COMUNIDADES EUROPEA es_ES
dc.contributor.funder Universitat Politècnica de València es_ES
dc.subject.ods 04.- Ensure inclusive and equitable quality education and promote lifelong learning opportunities for all es_ES
dc.subject.ods 10.- Reduce inequality within and among countries es_ES
upv.costeAPC 2773,8 es_ES

