
Translating without In-domain Corpus: Machine Translation Post-Editing with Online Learning Techniques

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia


dc.contributor.author Lagarda Arroyo, Antonio Luís es_ES
dc.contributor.author Ortiz Martínez, Daniel es_ES
dc.contributor.author Alabau, V. es_ES
dc.contributor.author Casacuberta Nolla, Francisco es_ES
dc.date.accessioned 2016-05-05T11:52:36Z
dc.date.available 2016-05-05T11:52:36Z
dc.date.issued 2015-07
dc.identifier.issn 0885-2308
dc.identifier.uri http://hdl.handle.net/10251/63702
dc.description.abstract [EN] Globalization has dramatically increased the need to translate information from one language to another. Frequently, such translation needs must be met under very tight time constraints. Machine translation (MT) techniques can offer a solution to this overly complex problem. However, the documents to be translated in real scenarios are often limited to a specific domain, such as a particular type of medical or legal text. This situation seriously hinders the applicability of MT, since building a reliable translation system is usually expensive, whatever technology is used, because of the linguistic resources it requires, such as dictionaries, translation memories or parallel texts. To solve this problem, we propose applying automatic post-editing in an online learning framework. The proposed technique allows the human expert to translate in a specific domain by using a base translation system designed for a general domain, whose output is corrected (or adapted to the specific domain) by an automatic post-editing module. This module learns to make its corrections from user feedback in real time by means of online learning techniques. We have validated our system using different translation technologies to implement the base translation system, as well as several texts involving different domains and languages. In most cases, our results show significant improvements in terms of BLEU (up to 16 points) with respect to the baseline systems. The proposed technique works effectively when the n-grams of the document to be translated present a certain rate of repetition, a situation that is common according to the document-internal repetition property. es_ES
dc.description.sponsorship Work partially supported by the European Union 7th Framework Programme (FP7/2007-2013) under the CasMaCat Project (Grant Agreement No. 287576), by Spanish MICINN under Grant TIN2012-31723, and by the Generalitat Valenciana under Grant ALMPR ALMAMATER (PROMETEOII/2014/030) and under Grant IMASI (ISIC/2012/004). en_EN
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Computer Speech and Language es_ES
dc.rights All rights reserved es_ES
dc.subject Machine translation es_ES
dc.subject Statistical machine translation es_ES
dc.subject Interactive machine translation es_ES
dc.subject Automatic post-editing es_ES
dc.subject Online learning es_ES
dc.title Translating without In-domain Corpus: Machine Translation Post-Editing with Online Learning Techniques es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.csl.2014.10.004
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/287576/EU/Cognitive Analysis and Statistical Methods for Advanced Computer Aided Translation/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//TIN2012-31723/ES/INTERACCION ACTIVA PARA TRANSCRIPCION DE HABLA Y TRADUCCION/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GVA//PROMETEOII%2F2014%2F030/ES/Adaptive learning and multimodality in machine translation and text transcription/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/GVA//ISIC%2F2012%2F004/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Lagarda Arroyo, AL.; Ortiz Martínez, D.; Alabau, V.; Casacuberta Nolla, F. (2015). Translating without In-domain Corpus: Machine Translation Post-Editing with Online Learning Techniques. Computer Speech and Language. 32(1):109-134. https://doi.org/10.1016/j.csl.2014.10.004 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion http://dx.doi.org/10.1016/j.csl.2014.10.004 es_ES
dc.description.upvformatpinicio 109 es_ES
dc.description.upvformatpfin 134 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 32 es_ES
dc.description.issue 1 es_ES
dc.relation.senia 278450 es_ES
dc.contributor.funder European Commission
dc.contributor.funder Generalitat Valenciana
dc.contributor.funder Ministerio de Economía y Competitividad es_ES
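
The abstract notes that the technique works well when the n-grams of the document show a certain rate of repetition. As a rough illustration only, the sketch below estimates how often the n-grams of each sentence have already appeared in earlier sentences of the same document, which is the information an online post-editing module could have learned from by that point. The function names `ngrams` and `repetition_rate`, the whitespace tokenization, and this particular definition of the rate are assumptions made for the example; they are not taken from the paper.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def repetition_rate(sentences, n=2):
    """Fraction of n-grams already seen in earlier sentences of the document.

    Assumed definition for illustration: sentences are processed in order,
    and an n-gram counts as repeated only if it occurred in a previous
    sentence (mimicking what an online learner has seen so far).
    """
    seen = set()
    total = 0
    repeated = 0
    for sentence in sentences:
        grams = ngrams(sentence.lower().split(), n)
        total += len(grams)
        repeated += sum(1 for gram in grams if gram in seen)
        # Only after "translating" the sentence do its n-grams become known.
        seen.update(grams)
    return repeated / total if total else 0.0


document = [
    "the patient has a fever",
    "the patient has a cough",
    "the doctor sees the patient",
]
rate = repetition_rate(document, n=2)
print(f"bigram repetition rate: {rate:.2f}")
```

Under this assumed definition, a higher value suggests that more of the document could in principle benefit from corrections learned on earlier sentences.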

