GREAT: open source software for statistical machine translation

González Mollá, Jorge; Casacuberta Nolla, Francisco

doi:10.1007/s10590-011-9097-6

Files in this item

Name: fosmt.pdf

Size: 232.2Kb

Format: PDF

Description: Author's version.

Name: González;Casacuberta ...

Size: 313.2Kb

Format: PDF

Description: Publisher's version

dc.contributor.author	González Mollá, Jorge	es_ES
dc.contributor.author	Casacuberta Nolla, Francisco	es_ES
dc.date.accessioned	2014-03-24T10:34:23Z
dc.date.issued	2011-06-01
dc.identifier.issn	0922-6567
dc.identifier.uri	http://hdl.handle.net/10251/36594
dc.description	The final publication is available at Springer via http://dx.doi.org/10.1007/s10590-011-9097-6	es_ES
dc.description.abstract	[EN] In this article, the first public release of GREAT as an open-source, statistical machine translation (SMT) software toolkit is described. GREAT is based on a bilingual language modelling approach for SMT, which is so far implemented for n-gram models based on the framework of stochastic finite-state transducers. The use of finite-state models is motivated by their simplicity, their versatility, and the fact that they present a lower computational cost, if compared with other more expressive models. Moreover, if translation is assumed to be a subsequential process, finite-state models are enough for modelling the existing relations between a source and a target language. GREAT includes some characteristics usually present in state-of-the-art SMT, such as phrase-based translation models or a log-linear framework for local features. Experimental results on a well-known corpus such as Europarl are reported in order to validate this software. A competitive translation quality is achieved, yet using both a lower number of model parameters and a lower response time than the widely-used, state-of-the-art SMT system Moses. © 2011 Springer Science+Business Media B.V.	es_ES
dc.description.sponsorship	Study was supported by the EC (FEDER, FSE), the Spanish government (MICINN, MITyC, “Plan E”, under Grants MIPRCV “Consolider Ingenio 2010”, iTrans2 TIN2009-14511, and erudito.com TSI-020110-2009-439), and the Generalitat Valenciana (Grant Prometeo/2009/014).
dc.format.extent	16	es_ES
dc.language	English	es_ES
dc.publisher	Springer Netherlands	es_ES
dc.relation.ispartof	Machine Translation	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	Grammatical inference	es_ES
dc.subject	Language modelling	es_ES
dc.subject	Monotonic bilingual segmentation	es_ES
dc.subject	Statistical machine translation	es_ES
dc.subject	Stochastic finite-state transducers	es_ES
dc.subject.classification	COMPUTER LANGUAGES AND SYSTEMS	es_ES
dc.title	GREAT: open source software for statistical machine translation	es_ES
dc.type	Article	es_ES
dc.embargo.lift	10000-01-01
dc.embargo.terms	forever	es_ES
dc.identifier.doi	10.1007/s10590-011-9097-6
dc.relation.projectID	info:eu-repo/grantAgreement/MICINN//TIN2009-14511/ES/Traduccion De Textos Y Transcripcion De Voz Interactivas/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/MITURCO//TSI-020110-2009-0439/ES/ERUDITO.COM/	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/Generalitat Valenciana//PROMETEO09%2F2009%2F014/ES/Adaptive learning and multimodality in pattern recognition (Almapater)/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.description.bibliographicCitation	González Mollá, J.; Casacuberta Nolla, F. (2011). GREAT: open source software for statistical machine translation. Machine Translation. 25(2):145-160. https://doi.org/10.1007/s10590-011-9097-6	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	http://link.springer.com/article/10.1007%2Fs10590-011-9097-6	es_ES
dc.description.upvformatpinicio	145	es_ES
dc.description.upvformatpfin	160	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	25	es_ES
dc.description.issue	2	es_ES
dc.relation.senia	201860
dc.contributor.funder	Ministerio de Ciencia e Innovación
dc.contributor.funder	Generalitat Valenciana
dc.contributor.funder	Ministerio de Industria, Turismo y Comercio	es_ES
dc.description.references	Amengual JC, Benedí JM, Casacuberta F, Castaño MA, Castellanos A, Jiménez VM, Llorens D, Marzal A, Pastor M, Prat F, Vidal E, Vilar JM (2000) The EUTRANS-I speech translation system. Mach Transl 15(1-2): 75–103	es_ES
dc.description.references	Andrés-Ferrer J, Juan-Císcar A, Casacuberta F (2008) Statistical estimation of rational transducers applied to machine translation. Appl Artif Intell 22(1–2): 4–22	es_ES
dc.description.references	Bangalore S, Riccardi G (2002) Stochastic finite-state models for spoken language machine translation. Mach Transl 17(3): 165–184	es_ES
dc.description.references	Berstel J (1979) Transductions and context-free languages. B.G. Teubner, Stuttgart, Germany	es_ES
dc.description.references	Casacuberta F, Vidal E (2004) Machine translation with inferred stochastic finite-state transducers. Comput Linguist 30(2): 205–225	es_ES
dc.description.references	Casacuberta F, Vidal E (2007) Learning finite-state models for machine translation. Mach Learn 66(1): 69–91	es_ES
dc.description.references	Foster G, Kuhn R, Johnson H (2006) Phrasetable smoothing for statistical machine translation. In: Proceedings of the 11th Conference on Empirical Methods in Natural Language Processing, Stroudsburg, PA, pp 53–61	es_ES
dc.description.references	González J (2009) Aprendizaje de transductores estocásticos de estados finitos y su aplicación en traducción automática. PhD thesis, Universitat Politècnica de València. Advisor: Casacuberta F	es_ES
dc.description.references	González J, Casacuberta F (2009) GREAT: a finite-state machine translation toolkit implementing a grammatical inference approach for transducer inference (GIATI). In: Proceedings of the EACL Workshop on Computational Linguistic Aspects of Grammatical Inference, Athens, Greece, pp 24–32	es_ES
dc.description.references	Kanthak S, Vilar D, Matusov E, Zens R, Ney H (2005) Novel reordering approaches in phrase-based statistical machine translation. In: Proceedings of the ACL Workshop on Building and Using Parallel Texts: Data-Driven Machine Translation and Beyond, Ann Arbor, MI, pp 167–174	es_ES
dc.description.references	Karttunen L (2001) Applications of finite-state transducers in natural language processing. In: Proceedings of the 5th Conference on Implementation and Application of Automata, London, UK, pp 34–46	es_ES
dc.description.references	Kneser R, Ney H (1995) Improved backing-off for n-gram language modeling. In: Proceedings of the 20th IEEE International Conference on Acoustics, Speech and Signal Processing, Detroit, MI, pp 181–184	es_ES
dc.description.references	Knight K, Al-Onaizan Y (1998) Translation with finite-state devices. In: Proceedings of the 3rd Conference of the Association for Machine Translation in the Americas, Langhorne, PA, pp 421–437	es_ES
dc.description.references	Koehn P (2004) Statistical significance tests for machine translation evaluation. In: Proceedings of the 9th Conference on Empirical Methods in Natural Language Processing, Barcelona, Spain, pp 388–395	es_ES
dc.description.references	Koehn P (2005) Europarl: a parallel corpus for statistical machine translation. In: Proceedings of the 10th Machine Translation Summit, Phuket, Thailand, pp 79–86	es_ES
dc.description.references	Koehn P (2010) Statistical machine translation. Cambridge University Press, Cambridge, UK	es_ES
dc.description.references	Koehn P, Hoang H (2007) Factored translation models. In: Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Prague, Czech Republic, pp 868–876	es_ES
dc.description.references	Koehn P, Hoang H, Birch A, Callison-Burch C, Federico M, Bertoldi N, Cowan B, Shen W, Moran C, Zens R, Dyer C, Bojar O, Constantin A, Herbst E (2007) Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, Prague, Czech Republic, pp 177–180	es_ES
dc.description.references	Kumar S, Deng Y, Byrne W (2006) A weighted finite state transducer translation template model for statistical machine translation. Nat Lang Eng 12(1): 35–75	es_ES
dc.description.references	Li Z, Callison-Burch C, Dyer C, Ganitkevitch J, Khudanpur S, Schwartz L, Thornton WNG, Weese J, Zaidan OF (2009) Joshua: an open source toolkit for parsing-based machine translation. In: Proceedings of the ACL Workshop on Statistical Machine Translation, Morristown, NJ, pp 135–139	es_ES
dc.description.references	Llorens D, Vilar JM, Casacuberta F (2002) Finite state language models smoothed using n-grams. Int J Pattern Recognit Artif Intell 16(3): 275–289	es_ES
dc.description.references	Marcu D, Wong W (2002) A phrase-based, joint probability model for statistical machine translation. In: Proceedings of the 7th Conference on Empirical Methods in Natural Language Processing, Morristown, NJ, pp 133–139	es_ES
dc.description.references	Mariño JB, Banchs RE, Crego JM, de Gispert A, Lambert P, Fonollosa JAR, Costa-jussà MR (2006) N-gram-based machine translation. Comput Linguist 32(4): 527–549	es_ES
dc.description.references	Medvedev YT (1964) On the class of events representable in a finite automaton. In: Moore EF (ed) Sequential machines: selected papers. Addison Wesley, Reading, MA	es_ES
dc.description.references	Mohri M, Pereira F, Riley M (2002) Weighted finite-state transducers in speech recognition. Comput Speech Lang 16(1): 69–88	es_ES
dc.description.references	Och FJ, Ney H (2002) Discriminative training and maximum entropy models for statistical machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, pp 295–302	es_ES
dc.description.references	Och FJ, Ney H (2003) A systematic comparison of various statistical alignment models. Comput Linguist 29(1): 19–51	es_ES
dc.description.references	Ortiz D, García-Varea I, Casacuberta F (2005) Thot: a toolkit to train phrase-based statistical translation models. In: Proceedings of the 10th Machine Translation Summit, Phuket, Thailand, pp 141–148	es_ES
dc.description.references	Papineni K, Roukos S, Ward T, Zhu WJ (2002) Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Philadelphia, PA, pp 311–318	es_ES
dc.description.references	Pérez A, Torres MI, Casacuberta F (2008) Joining linguistic and statistical methods for Spanish-to-Basque speech translation. Speech Commun 50: 1021–1033	es_ES
dc.description.references	Picó D, Casacuberta F (2001) Some statistical-estimation methods for stochastic finite-state transducers. Mach Learn 44: 121–142	es_ES
dc.description.references	Rosenfeld R (1996) A maximum entropy approach to adaptive statistical language modeling. Comput Speech Lang 10: 187–228	es_ES
dc.description.references	Simard M, Plamondon P (1998) Bilingual sentence alignment: balancing robustness and accuracy. Mach Transl 13(1): 59–80	es_ES
dc.description.references	Singh AK, Husain S (2007) Exploring translation similarities for building a better sentence aligner. In: Proceedings of the 3rd Indian International Conference on Artificial Intelligence, Pune, India, pp 1852–1863	es_ES
dc.description.references	Steinbiss V, Tran BH, Ney H (1994) Improvements in beam search. In: Proceedings of the 3rd International Conference on Spoken Language Processing, Yokohama, Japan, pp 2143–2146	es_ES
dc.description.references	Torres MI, Varona A (2001) k-TSS language models in speech recognition systems. Comput Speech Lang 15(2): 127–149	es_ES
dc.description.references	Vidal E (1997) Finite-state speech-to-speech translation. In: Proceedings of the 22nd IEEE International Conference on Acoustics, Speech and Signal Processing, Munich, Germany, pp 111–114	es_ES
dc.description.references	Vidal E, Thollard F, de la Higuera C, Casacuberta F, Carrasco RC (2005) Probabilistic finite-state machines–Part II. IEEE Trans Pattern Anal Mach Intell 27(7): 1025–1039	es_ES
dc.description.references	Viterbi A (1967) Error bounds for convolutional codes and an asymptotically optimum decoding algorithm. IEEE Trans Inf Theory 13(2): 260–269	es_ES

This item appears in the following collection(s)

Articles, conferences, monographs [45910]

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia