Iranzo-Sánchez, J.; Silvestre Cerdà, JA.; Jorge-Cano, J.; Roselló, N.; Giménez, A.; Sanchis Navarro, JA.; Civera Saiz, J.... (2020). Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates. IEEE. 8229-8233. https://doi.org/10.1109/ICASSP40776.2020.9054626
Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/202368
Title:
|
Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates
|
Authors:
|
Iranzo-Sánchez, Javier
Silvestre Cerdà, Joan Albert
Jorge-Cano, Javier
Roselló, Nahuel
Giménez, Adriá
Sanchis Navarro, José Alberto
Civera Saiz, Jorge
Juan, Alfons
|
UPV unit:
|
Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica
Universitat Politècnica de València. Instituto Universitario Valenciano de Investigación en Inteligencia Artificial - Institut Universitari Valencià de Recerca en Intel·ligència Artificial
Universitat Politècnica de València. Escuela Politécnica Superior de Alcoy - Escola Politècnica Superior d'Alcoi
|
Release date:
|
|
Abstract:
|
[EN] Current research into spoken language translation (SLT), or speech-to-text translation, is often hampered by the lack of specific data resources for this task, as currently available SLT datasets are restricted to a limited set of language pairs. In this paper we present Europarl-ST, a novel multilingual SLT corpus containing paired audio-text samples for SLT from and into 6 European languages, for a total of 30 different translation directions. This corpus has been compiled using the debates held in the European Parliament in the period between 2008 and 2012. This paper describes the corpus creation process and presents a series of automatic speech recognition, machine translation and spoken language translation experiments that highlight the potential of this new resource. The corpus is released under a Creative Commons license and is freely accessible and downloadable.
|
Keywords:
|
Speech translation, Spoken language translation, Automatic speech recognition, Machine translation, Multilingual corpus
|
Usage rights:
|
All rights reserved
|
ISBN:
|
978-1-5090-6631-5
|
Source:
|
2020 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings.
|
DOI:
|
10.1109/ICASSP40776.2020.9054626
|
Publisher:
|
IEEE
|
Publisher's version:
|
https://doi.org/10.1109/ICASSP40776.2020.9054626
|
Conference title:
|
45th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020)
|
Conference location:
|
Barcelona, Spain
|
Conference date:
|
May 4-8, 2020
|
Project code:
|
info:eu-repo/grantAgreement///761758//X5GON: CROSS MODAL, CROSS CULTURAL, CROSS LINGUAL, CROSS DOMAIN, AND CROSS SITE GLOBAL OER NETWORK/
info:eu-repo/grantAgreement/AEI//RTI2018-094879-B-I00-AR//SUBTITULACION MULTILINGÜE DE CLASES DE AULA Y SESIONES PLENARIAS/
info:eu-repo/grantAgreement/MCIU//FPU18%2F04135//Ayudas para contratos predoctorales para la Formación de Profesorado Universitario (FPU - Iranzo-Sánchez). Proyecto: Novel contributions to neural speech translation/
|
Acknowledgements:
|
The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 761758 (X5gon); MCIU/AEI/FEDER,UE under the Multisub (RTI2018-094879-B-I00) research project and the Government of Spain's FPU scholarship FPU18/04135.
|
Type:
|
Conference paper
Book chapter
|