Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates

Iranzo-Sánchez, Javier; Silvestre Cerdà, Joan Albert; Jorge-Cano, Javier; Roselló, Nahuel; Giménez, Adriá; Sanchis Navarro, José Alberto; Civera Saiz, Jorge; Juan, Alfons

doi:10.1109/ICASSP40776.2020.9054626


Files in this item

Name: Iranzo-SanchezSil ...

Size: 165.0Kb

Format: PDF

Description: Author's version.

Name: 09054626.pdf

Size: 299.6Kb

Format: PDF

Description: Publisher's version.

dc.contributor.author	Iranzo-Sánchez, Javier	es_ES
dc.contributor.author	Silvestre Cerdà, Joan Albert	es_ES
dc.contributor.author	Jorge-Cano, Javier	es_ES
dc.contributor.author	Roselló, Nahuel	es_ES
dc.contributor.author	Giménez, Adriá	es_ES
dc.contributor.author	Sanchis Navarro, José Alberto	es_ES
dc.contributor.author	Civera Saiz, Jorge	es_ES
dc.contributor.author	Juan, Alfons	es_ES
dc.date.accessioned	2024-02-06T11:13:38Z
dc.date.available	2024-02-06T11:13:38Z
dc.date.issued	2020-05-08	es_ES
dc.identifier.isbn	978-1-5090-6631-5	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/202368
dc.description.abstract	[EN] Current research into spoken language translation (SLT), or speech-to-text translation, is often hampered by the lack of specific data resources for this task, as currently available SLT datasets are restricted to a limited set of language pairs. In this paper we present Europarl-ST, a novel multilingual SLT corpus containing paired audio-text samples for SLT from and into 6 European languages, for a total of 30 different translation directions. This corpus has been compiled using the debates held in the European Parliament in the period between 2008 and 2012. This paper describes the corpus creation process and presents a series of automatic speech recognition, machine translation and spoken language translation experiments that highlight the potential of this new resource. The corpus is released under a Creative Commons license and is freely accessible and downloadable.	es_ES
dc.description.sponsorship	The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 761758 (X5gon); MCIU/AEI/FEDER,UE under the Multisub (RTI2018-094879-B-I00) research project and the Government of Spain's FPU scholarship FPU18/04135.	es_ES
dc.language	English	es_ES
dc.publisher	IEEE	es_ES
dc.relation.ispartof	2020 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	Speech translation	es_ES
dc.subject	Spoken language translation	es_ES
dc.subject	Automatic speech recognition	es_ES
dc.subject	Machine translation	es_ES
dc.subject	Multilingual corpus	es_ES
dc.subject.classification	LENGUAJES Y SISTEMAS INFORMATICOS	es_ES
dc.title	Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates	es_ES
dc.type	Conference paper	es_ES
dc.type	Book chapter	es_ES
dc.identifier.doi	10.1109/ICASSP40776.2020.9054626	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement///761758//X5GON: CROSS MODAL, CROSS CULTURAL, CROSS LINGUAL, CROSS DOMAIN, AND CROSS SITE GLOBAL OER NETWORK/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI//RTI2018-094879-B-I00-AR//SUBTITULACION MULTILINGÜE DE CLASES DE AULA Y SESIONES PLENARIAS/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MCIU//FPU18%2F04135//Ayudas para contratos predoctorales para la Formación de Profesorado Universitario (FPU - Iranzo-Sánchez). Proyecto: Novel contributions to neural speech translation/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Instituto Universitario Valenciano de Investigación en Inteligencia Artificial - Institut Universitari Valencià de Recerca en Intel·ligència Artificial	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Escuela Politécnica Superior de Alcoy - Escola Politècnica Superior d'Alcoi	es_ES
dc.description.bibliographicCitation	Iranzo-Sánchez, J.; Silvestre Cerdà, JA.; Jorge-Cano, J.; Roselló, N.; Giménez, A.; Sanchis Navarro, JA.; Civera Saiz, J.... (2020). Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates. IEEE. 8229-8233. https://doi.org/10.1109/ICASSP40776.2020.9054626	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.conferencename	45th IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2020)	es_ES
dc.relation.conferencedate	May 04-08, 2020	es_ES
dc.relation.conferenceplace	Barcelona, Spain	es_ES
dc.relation.publisherversion	https://doi.org/10.1109/ICASSP40776.2020.9054626	es_ES
dc.description.upvformatpinicio	8229	es_ES
dc.description.upvformatpfin	8233	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.relation.pasarela	S\412783	es_ES
dc.contributor.funder	European Regional Development Fund	es_ES
dc.contributor.funder	Ministerio de Ciencia, Innovación y Universidades	es_ES
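
The abstract above states that translating from and into 6 languages gives 30 translation directions. A minimal sketch of that count follows, assuming each ordered (source, target) pair of distinct languages is one direction. The labels L1 to L6 are placeholders, since this record does not list the languages.

```python
from itertools import permutations

# Placeholder labels (an assumption): the record does not name the six languages.
languages = ["L1", "L2", "L3", "L4", "L5", "L6"]

# One direction per ordered (source, target) pair with source != target.
directions = list(permutations(languages, 2))

# n languages give n * (n - 1) directions, so 6 * 5 = 30.
print(len(directions))
```

Because order matters (L1 to L2 is a different direction from L2 to L1), the count uses ordered pairs rather than unordered combinations.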


RiuNet: Institutional Repository of the Universidad Politécnica de Valencia
