González Rubio, J.; Casacuberta Nolla, F. (2014). Cost-sensitive active learning for computer-assisted translation. Pattern Recognition Letters. 37(1):124-134. https://doi.org/10.1016/j.patrec.2013.06.007
Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/40333
Title:
|
Cost-sensitive active learning for computer-assisted translation
|
Author:
|
González Rubio, Jesús
Casacuberta Nolla, Francisco
|
UPV entity:
|
Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació
|
Release date:
|
|
Abstract:
|
[EN] Machine translation technology is not perfect. To be successfully embedded in real-world applications, it must compensate for its imperfections by interacting intelligently with the user within a computer-assisted translation framework. The interactive-predictive paradigm, where both a statistical translation model and a human expert collaborate to generate the translation, has been shown to be an effective computer-assisted translation approach. However, the exhaustive supervision of all translations and the use of non-incremental translation models penalize the productivity of conventional interactive-predictive systems.
We propose a cost-sensitive active learning framework for computer-assisted translation whose goal is to make the translation process as painless as possible. In contrast to conventional active learning scenarios, the proposed active learning framework is designed to minimize not only how many translations the user must supervise but also how difficult each translation is to supervise. To do that, we address the two potential drawbacks of the interactive-predictive translation paradigm. On the one hand, user effort is focused on those translations whose user supervision is considered more "informative", thus maximizing the utility of each user interaction. On the other hand, we use a dynamic machine translation model that is continually updated with user feedback after deployment. We empirically validated each of the technical components in simulation and quantified the user effort saved. We conclude that both selective translation supervision and translation model updating lead to important user-effort reductions, and consequently to improved translation productivity.
|
Keywords:
|
Computer-assisted translation, Interactive machine translation, Active learning, Online learning
|
Usage rights:
|
All rights reserved
|
Source:
|
Pattern Recognition Letters (ISSN: 0167-8655)
|
DOI:
|
10.1016/j.patrec.2013.06.007
|
Publisher:
|
Elsevier
|
Publisher's version:
|
http://dx.doi.org/10.1016/j.patrec.2013.06.007
|
Project code:
|
info:eu-repo/grantAgreement/EC/FP7/287576/EU/Cognitive Analysis and Statistical Methods for Advanced Computer Aided Translation/
info:eu-repo/grantAgreement/Generalitat Valenciana//PROMETEO09%2F2009%2F014/ES/Adaptive learning and multimodality in pattern recognition (Almapater)/
info:eu-repo/grantAgreement/MINECO//TIN2012-31723/ES/INTERACCION ACTIVA PARA TRANSCRIPCION DE HABLA Y TRADUCCION/
|
Description:
|
This is the author’s version of a work that was accepted for publication in Pattern Recognition Letters. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Pattern Recognition Letters, [Volume 37, 1 February 2014, Pages 124–134] DOI: 10.1016/j.patrec.2013.06.007
|
Acknowledgements:
|
Work supported by the European Union Seventh Framework Program (FP7/2007-2013) under the CasMaCat Project (Grant agreement No. 287576), by the Generalitat Valenciana under Grant ALMPR (Prometeo/2009/014), and by the Spanish Government under Grant TIN2012-31723. The authors thank Daniel Ortiz-Martinez for providing the log-linear SMT model with incremental features and the corresponding online learning algorithms. The authors also thank the anonymous reviewers for their criticisms and suggestions.
|
Type:
|
Article
|