Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates

RiuNet: Institutional Repository of the Universitat Politècnica de València

dc.contributor.author Iranzo-Sánchez, Javier es_ES
dc.contributor.author Silvestre Cerdà, Joan Albert es_ES
dc.contributor.author Jorge-Cano, Javier es_ES
dc.contributor.author Roselló, Nahuel es_ES
dc.contributor.author Giménez, Adrià es_ES
dc.contributor.author Sanchis Navarro, José Alberto es_ES
dc.contributor.author Civera Saiz, Jorge es_ES
dc.contributor.author Juan, Alfons es_ES
dc.date.accessioned 2024-02-06T11:13:38Z
dc.date.available 2024-02-06T11:13:38Z
dc.date.issued 2020-05-08 es_ES
dc.identifier.isbn 978-1-5090-6631-5 es_ES
dc.identifier.uri http://hdl.handle.net/10251/202368
dc.description.abstract [EN] Current research into spoken language translation (SLT), or speech-to-text translation, is often hampered by the lack of specific data resources for this task, as currently available SLT datasets are restricted to a limited set of language pairs. In this paper we present Europarl-ST, a novel multilingual SLT corpus containing paired audio-text samples for SLT from and into 6 European languages, for a total of 30 different translation directions. This corpus has been compiled using the debates held in the European Parliament in the period between 2008 and 2012. This paper describes the corpus creation process and presents a series of automatic speech recognition, machine translation and spoken language translation experiments that highlight the potential of this new resource. The corpus is released under a Creative Commons license and is freely accessible and downloadable. es_ES
dc.description.sponsorship The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 761758 (X5gon); MCIU/AEI/FEDER,UE under the Multisub (RTI2018-094879-B-I00) research project and the Government of Spain's FPU scholarship FPU18/04135. es_ES
dc.language English es_ES
dc.publisher IEEE es_ES
dc.relation.ispartof 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings es_ES
dc.rights All rights reserved es_ES
dc.subject Speech translation es_ES
dc.subject Spoken language translation es_ES
dc.subject Automatic speech recognition es_ES
dc.subject Machine translation es_ES
dc.subject Multilingual corpus es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates es_ES
dc.type Conference paper es_ES
dc.type Book chapter es_ES
dc.identifier.doi 10.1109/ICASSP40776.2020.9054626 es_ES
dc.relation.projectID info:eu-repo/grantAgreement///761758//X5GON: CROSS MODAL, CROSS CULTURAL, CROSS LINGUAL, CROSS DOMAIN, AND CROSS SITE GLOBAL OER NETWORK/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI//RTI2018-094879-B-I00-AR//SUBTITULACION MULTILINGÜE DE CLASES DE AULA Y SESIONES PLENARIAS/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MCIU//FPU18%2F04135//Ayudas para contratos predoctorales para la Formación de Profesorado Universitario (FPU - Iranzo-Sánchez). Proyecto: Novel contributions to neural speech translation/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.contributor.affiliation Universitat Politècnica de València. Instituto Universitario Valenciano de Investigación en Inteligencia Artificial - Institut Universitari Valencià de Recerca en Intel·ligència Artificial es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escuela Politécnica Superior de Alcoy - Escola Politècnica Superior d'Alcoi es_ES
dc.description.bibliographicCitation Iranzo-Sánchez, J.; Silvestre Cerdà, JA.; Jorge-Cano, J.; Roselló, N.; Giménez, A.; Sanchis Navarro, JA.; Civera Saiz, J.... (2020). Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates. IEEE. 8229-8233. https://doi.org/10.1109/ICASSP40776.2020.9054626 es_ES
dc.description.accrualMethod S es_ES
dc.relation.conferencename 45th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020) es_ES
dc.relation.conferencedate May 04-08, 2020 es_ES
dc.relation.conferenceplace Barcelona, Spain es_ES
dc.relation.publisherversion https://doi.org/10.1109/ICASSP40776.2020.9054626 es_ES
dc.description.upvformatpinicio 8229 es_ES
dc.description.upvformatpfin 8233 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.relation.pasarela S\412783 es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder Ministerio de Ciencia, Innovación y Universidades es_ES
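
The abstract's figure of 30 translation directions follows from counting ordered (source, target) pairs over 6 languages, since translating from A into B is a different direction than from B into A: 6 × 5 = 30. A minimal sketch of that count is given here; the labels `L1` to `L6` are hypothetical placeholders, because this record does not name the six languages, and `translation_directions` is an illustrative helper, not part of the corpus tooling.

```python
from itertools import permutations

# Hypothetical placeholder labels: the record does not list the six languages.
LANGUAGES = ["L1", "L2", "L3", "L4", "L5", "L6"]


def translation_directions(languages):
    """Return every ordered (source, target) pair with source != target."""
    return list(permutations(languages, 2))


directions = translation_directions(LANGUAGES)
print(len(directions))
```

Ordered pairs are used rather than unordered ones because each language serves both as a source and as a target.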

