dc.contributor.advisor	Casacuberta Nolla, Francisco	es_ES
dc.contributor.advisor	Sanchis Navarro, José Alberto	es_ES
dc.contributor.author	Alabau Gonzalvo, Vicente	es_ES
dc.date.accessioned	2014-01-27T08:10:48Z
dc.date.available	2014-01-27T08:10:48Z
dc.date.created	2014-01-10T11:00:11Z	es_ES
dc.date.issued	2014-01-27T08:10:45Z	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/35135
dc.description.abstract	This thesis presents scientific contributions to the field of multimodal interac- tive structured prediction (MISP). The aim of MISP is to reduce the human effort required to supervise an automatic output, in an efficient and ergonomic way. Hence, this thesis focuses on the two aspects of MISP systems. The first aspect, which refers to the interactive part of MISP, is the study of strate- gies for efficient human¿computer collaboration to produce error-free outputs. Multimodality, the second aspect, deals with other more ergonomic modalities of communication with the computer rather than keyboard and mouse. To begin with, in sequential interaction the user is assumed to supervise the output from left-to-right so that errors are corrected in sequential order. We study the problem under the decision theory framework and define an optimum decoding algorithm. The optimum algorithm is compared to the usually ap- plied, standard approach. Experimental results on several tasks suggests that the optimum algorithm is slightly better than the standard algorithm. In contrast to sequential interaction, in active interaction it is the system that decides what should be given to the user for supervision. On the one hand, user supervision can be reduced if the user is required to supervise only the outputs that the system expects to be erroneous. In this respect, we define a strategy that retrieves first the outputs with highest expected error first. Moreover, we prove that this strategy is optimum under certain conditions, which is validated by experimental results. On the other hand, if the goal is to reduce the number of corrections, active interaction works by selecting elements, one by one, e.g., words of a given output to be supervised by the user. For this case, several strategies are compared. Unlike the previous case, the strategy that performs better is to choose the element with highest confidence, which coincides with the findings of the optimum algorithm for sequential interaction. However, this also suggests that minimizing effort and supervision are contradictory goals. With respect to the multimodality aspect, this thesis delves into techniques to make multimodal systems more robust. To achieve that, multimodal systems are improved by providing contextual information of the application at hand. First, we study how to integrate e-pen interaction in a machine translation task. We contribute to the state-of-the-art by leveraging the information from the source sentence. Several strategies are compared basically grouped into two approaches: inspired by word-based translation models and n-grams generated from a phrase-based system. The experiments show that the former outper- forms the latter for this task. Furthermore, the results present remarkable improvements against not using contextual information. Second, similar ex- periments are conducted on a speech-enabled interface for interactive machine translation. The improvements over the baseline are also noticeable. How- ever, in this case, phrase-based models perform much better than word-based models. We attribute that to the fact that acoustic models are poorer estima- tions than morphologic models and, thus, they benefit more from the language model. Finally, similar techniques are proposed for dictation of handwritten documents. The results show that speech and handwritten recognition can be combined in an effective way. Finally, an evaluation with real users is carried out to compare an interactive machine translation prototype with a post-editing prototype. The results of the study reveal that users are very sensitive to the usability aspects of the user interface. Therefore, usability is a crucial aspect to consider in an human evaluation that can hinder the real benefits of the technology being evaluated. Hopefully, once usability problems are fixed, the evaluation indicates that users are more favorable to work with the interactive machine translation system than to the post-editing system.	en_EN
dc.language	Inglés	es_ES
dc.publisher	Universitat Politècnica de València	es_ES
dc.rights	Reserva de todos los derechos	es_ES
dc.source	Riunet	es_ES
dc.subject	Structured prediction	es_ES
dc.subject	Handwritten text transcription
dc.subject	Automatic speech recognition
dc.subject	Machine translation
dc.subject	Human interaction
dc.subject	Interactive pattern recognition
dc.subject	Multimodal interaction
dc.subject	Interactive machine translation
dc.subject	Computer assisted transcription
dc.subject	Active interaction
dc.subject	Passive interaction
dc.subject.classification	LENGUAJES Y SISTEMAS INFORMATICOS	es_ES
dc.title	Multimodal interactive structured prediction
dc.type	Tesis doctoral	es_ES
dc.identifier.doi	10.4995/Thesis/10251/35135	es_ES
dc.rights.accessRights	Abierto	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.description.bibliographicCitation	Alabau Gonzalvo, V. (2014). Multimodal interactive structured prediction [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/35135	es_ES
dc.description.accrualMethod	TESIS	es_ES
dc.type.version	info:eu-repo/semantics/acceptedVersion	es_ES
dc.relation.tesis	8383	es_ES
dc.description.award	Premios Extraordinarios de tesis doctorales	es_ES

Multimodal interactive structured prediction

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia

Buscar en RiuNet

Listar

Todo RiuNet

Esta colección

Mi cuenta

Estadísticas

Ayuda RiuNet

Admin. UPV

Compartir/Enviar a

Citas

Estadísticas

Multimodal interactive structured prediction

Ficheros en el ítem

Este ítem aparece en la(s) siguiente(s) colección(ones)

Ítems relacionados