Generation of synthetic data based on hidden Markov models

Ferrando Huertas, Jaime


Files in this item

Name: FERRANDO - Generation ...

Size: 3.381Mb

Format: PDF

dc.contributor.advisor	Civera Saiz, Jorge	es_ES
dc.contributor.advisor	Lagergren, Jens	es_ES
dc.contributor.author	Ferrando Huertas, Jaime	es_ES
dc.date.accessioned	2018-09-04T11:05:31Z
dc.date.available	2018-09-04T11:05:31Z
dc.date.created	2018-07-20
dc.date.issued	2018-09-04	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/106545
dc.description.abstract	[EN] Machine learning has become a trending topic in recent years and is now one of the most in-demand careers in computer science. This growth has led to more complex models capable of driving a car or detecting cancer, although these improvements also owe much to gains in computational power. In this study we investigate a data exploration technique for creating synthetic data, a field of machine learning that has seen fewer improvements in recent years. Our project stems from an industrial process where data is a valuable asset: the process has both computational power and powerful models but struggles with data availability. In response, a model for generating data is proposed, aiming to compensate for the lack of data during data exploration and training in this industrial process. The model consists of a hidden Markov model whose states represent the different distributions the data follows; data is created by traveling through these states with an algorithm that uses the prior distribution of the states in a Dirichlet distribution. The method used to infer the data distributions from the given data and build the hidden Markov model is explained, along with the technique used to travel between states. Results are presented showing how the distribution inference performed and how well the synthetic data reproduces the original, with special attention to the reproduction of specific features of the original data. To get a better perspective on the generated data, we manipulated the states of our model, creating data from all of the states or only from the states with lower prior probability. Results showed that the model is capable of creating data similar to the real data, but it struggled with data containing a small number of significant outliers. In conclusion, a model for creating reliable data has been introduced, along with a list of possible improvements.	es_ES
dc.description.abstract	[ES] Nowadays a large number of companies are integrating machine learning techniques into their business model. These techniques require large amounts of data, which are not always available to every company. To address this real problem, we consider developing a synthetic data generator. This project studies a reliable method for generating synthetic data from an existing data collection, such that the synthetic data resemble the given collection. To that end, we will analyze the statistical distributions followed by the given data collection and estimate their parameters to create a hidden Markov model in which each state contains the parameters of one distribution, inferred from the given data collection. Once the hidden Markov model has been derived, a synthetic data generation algorithm will be developed that transitions between the states of the trained Markov model. Finally, we will analyze the validity of these synthetic data with an abnormal-behaviour detector based on support vector machines and/or recurrent neural networks.	es_ES
dc.format.extent	38	es_ES
dc.language	English	es_ES
dc.publisher	Universitat Politècnica de València	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	Markov models	es_ES
dc.subject	Synthetic data	es_ES
dc.subject	Detection of abnormal behaviour	es_ES
dc.subject	Modelo de Markov	es_ES
dc.subject	Datos sintéticas	es_ES
dc.subject	Detección comportamiento anormal	es_ES
dc.subject.classification	Computer Languages and Systems	es_ES
dc.subject.other	Bachelor's Degree in Informatics Engineering	es_ES
dc.title	Generation of synthetic data based on hidden Markov models	es_ES
dc.title.alternative	Generación de datos sintéticos basada en modelos ocultos de Markov	es_ES
dc.type	Final degree project/thesis	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica	es_ES
dc.description.bibliographicCitation	Ferrando Huertas, J. (2018). Generation of synthetic data based on hidden Markov models. http://hdl.handle.net/10251/106545	es_ES
dc.description.accrualMethod	TFGM	es_ES
dc.relation.pasarela	TFGM\88403	es_ES
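
The generation scheme described in the English abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the Gaussian emissions, the function names `sample_dirichlet` and `generate`, the `allowed` parameter, and the choice to draw each state's transition row from a Dirichlet distribution parameterized by prior state counts are all assumptions made here for illustration.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate(states, prior_counts, n_samples, seed=0, allowed=None):
    """Generate synthetic values by walking between hidden states.

    states: list of (mean, std) pairs, one Gaussian per state (assumed emission family).
    prior_counts: positive Dirichlet concentration parameters, for example how often
        each state was observed in the original data (hypothetical choice).
    allowed: optional list of state indices to restrict the walk to.
    """
    rng = random.Random(seed)
    indices = list(range(len(states))) if allowed is None else list(allowed)
    alphas = [prior_counts[i] for i in indices]

    # Assumption: one transition row per state, each drawn from the same Dirichlet prior.
    transitions = {i: sample_dirichlet(alphas, rng) for i in indices}

    # Initial state drawn in proportion to the prior counts.
    state = rng.choices(indices, weights=alphas, k=1)[0]

    path, values = [], []
    for _ in range(n_samples):
        mean, std = states[state]
        path.append(state)
        values.append(rng.gauss(mean, std))  # emit from the current state's distribution
        state = rng.choices(indices, weights=transitions[state], k=1)[0]
    return path, values


if __name__ == "__main__":
    toy_states = [(0.0, 1.0), (10.0, 0.5), (50.0, 2.0)]  # made-up parameters
    toy_prior = [80.0, 15.0, 5.0]                        # made-up prior counts
    path, values = generate(toy_states, toy_prior, n_samples=10)
    print(path)
    # Restrict generation to the two least probable states.
    rare_path, _ = generate(toy_states, toy_prior, n_samples=10, allowed=[1, 2])
    print(rare_path)
```

Restricting the walk through `allowed` loosely mirrors the experiment mentioned in the abstract, where data was created either from all states or only from the states with lower prior probability.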

This item appears in the following collection(s)

ETSINF - Academic works [4804]
Escola Tècnica Superior d'Enginyeria Informàtica


RiuNet: Institutional Repository of the Universitat Politècnica de València
