
Kinematics of Big Biomedical Data to characterize temporal variability and seasonality of data repositories: Functional Data Analysis of data temporal evolution over non-parametric statistical manifolds

RiuNet: Institutional Repository of the Universitat Politècnica de València


dc.contributor.author Sáez, Carlos es_ES
dc.contributor.author Garcia-Gomez, Juan M es_ES
dc.date.accessioned 2019-09-05T20:04:35Z
dc.date.available 2019-09-05T20:04:35Z
dc.date.issued 2018 es_ES
dc.identifier.issn 1386-5056 es_ES
dc.identifier.uri http://hdl.handle.net/10251/125106
dc.description.abstract [EN] Aim: The increasing availability of Big Biomedical Data is leading to large research data samples collected over long periods of time. We propose analysing the kinematics of data probability distributions over time to characterize data temporal variability. Methods: First, we propose a kinematic model based on the estimation of a continuous data temporal trajectory, using Functional Data Analysis over the embedding of a non-parametric statistical manifold whose points represent data temporal batches, the Information Geometric Temporal (IGT) plot. This model allows the velocity and acceleration of data changes to be measured. Next, we propose a coordinate-free method to characterize the oriented seasonality of data based on the parallelism of lagged velocity vectors of the data trajectory throughout the IGT space: the Auto-Parallelism of Velocity Vectors (APVV) and the APVVmap. Finally, we automatically explain the maximum-variance components of the IGT space coordinates by correlating data points with known temporal factors from the application domain. Materials: The methods are evaluated on the US National Hospital Discharge Survey (NHDS) open dataset, consisting of 3.25M hospital discharges between 2000 and 2010. Results: Seasonal and abrupt behaviours were present in the estimated multivariate and univariate data trajectories. The kinematic analysis revealed seasonal effects and isolated increments in data celerity, the latter mainly related to abrupt changes in coding. The APVV and APVVmap revealed oriented seasonal changes in data trajectories. For most variables, distributions tended to change in the same direction with a 12-month period, with a peak of change in directionality at mid-year and year end. Diagnosis and Procedure codes also included a 9-month periodic component.
The kinematics and APVV methods were able to detect seasonal effects in extreme temporal subgrouped data, such as Procedure codes, where Fourier and autocorrelation methods were not. The automated explanation of IGT space coordinates was consistent with the results provided by the kinematic and seasonal analysis. Coordinates received different meanings according to the trajectory trend, seasonality and abrupt changes. Discussion: Treating data as a particle moving over time through a multidimensional probabilistic space and studying the kinematics of its trajectory has turned out to be a new temporal variability methodology. Its results on the NHDS were aligned with the dataset and population descriptions found in the literature, contributing a novel temporal variability characterization. We have demonstrated that the APVV and APVVmap are appropriate tools for the coordinate-free and oriented analysis of trajectories or complex multivariate signals. Conclusion: The proposed methods comprise an exploratory methodology for the characterization of data temporal variability, which may be useful for the reliable reuse of Big Biomedical Data repositories acquired over long periods of time. es_ES
dc.description.sponsorship This work was supported by UPV grant No. PAID-00-17, and projects DPI2016-80054-R and H2020-SC1-2016-CNECT No. 727560. es_ES
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof International Journal of Medical Informatics es_ES
dc.rights Attribution - NonCommercial - NoDerivatives (by-nc-nd) es_ES
dc.subject Temporal stability es_ES
dc.subject Data quality es_ES
dc.subject Time series es_ES
dc.subject Data reuse es_ES
dc.subject Big data es_ES
dc.subject Seasonality es_ES
dc.subject Coordinate-free es_ES
dc.subject Trajectories es_ES
dc.subject Functional data analysis es_ES
dc.subject Statistical manifolds es_ES
dc.subject.classification APPLIED PHYSICS es_ES
dc.title Kinematics of Big Biomedical Data to characterize temporal variability and seasonality of data repositories: Functional Data Analysis of data temporal evolution over non-parametric statistical manifolds es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.ijmedinf.2018.09.015 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/EC/H2020/727560/EU/Collective wisdom driving public health policies/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//DPI2016-80054-R/ES/BIOMARCADORES DINAMICOS BASADOS EN FIRMAS TISULARES MULTIPARAMETRICAS PARA EL SEGUIMIENTO Y EVALUACION DE LA RESPUESTA A TRATAMIENTO DE PACIENTES CON GLIOBLASTOMA Y CANCER DE PRÓSTATA/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UPV//PAID-00-17/
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Física Aplicada - Departament de Física Aplicada es_ES
dc.description.bibliographicCitation Sáez, C.; Garcia-Gomez, JM. (2018). Kinematics of Big Biomedical Data to characterize temporal variability and seasonality of data repositories: Functional Data Analysis of data temporal evolution over non-parametric statistical manifolds. International Journal of Medical Informatics. 119:109-124. https://doi.org/10.1016/j.ijmedinf.2018.09.015 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1016/j.ijmedinf.2018.09.015 es_ES
dc.description.upvformatpinicio 109 es_ES
dc.description.upvformatpfin 124 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 119 es_ES
dc.identifier.pmid 30342679
dc.relation.pasarela S\385248 es_ES
dc.contributor.funder European Commission es_ES
dc.contributor.funder Universitat Politècnica de València
dc.contributor.funder Ministerio de Economía y Competitividad es_ES
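
The abstract describes measuring the velocity and acceleration of a data trajectory and assessing oriented seasonality through the parallelism of lagged velocity vectors. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the trajectory is already available as an array of embedded coordinates (one row per temporal batch), it replaces the Functional Data Analysis smoothing with plain finite differences, and it takes the mean cosine similarity as the measure of parallelism. The function names `kinematics` and `lagged_parallelism` are hypothetical.

```python
import numpy as np


def kinematics(points, dt=1.0):
    """Finite-difference velocity, speed and acceleration of a trajectory.

    points: array of shape (T, d), one embedded point per temporal batch.
    Assumption: finite differences stand in for the smoothed derivatives
    that a Functional Data Analysis fit would provide.
    """
    points = np.asarray(points, dtype=float)
    velocity = np.gradient(points, dt, axis=0)        # first derivative
    acceleration = np.gradient(velocity, dt, axis=0)  # second derivative
    speed = np.linalg.norm(velocity, axis=1)          # scalar rate of change
    return velocity, speed, acceleration


def lagged_parallelism(velocity, lag):
    """Mean cosine similarity between v(t) and v(t + lag).

    Values near +1 mean the trajectory tends to move in the same direction
    after `lag` steps, values near -1 mean the opposite direction.
    Assumption: this is an illustrative parallelism measure only.
    """
    if lag < 1 or lag >= len(velocity):
        raise ValueError("lag must be between 1 and len(velocity) - 1")
    a, b = velocity[:-lag], velocity[lag:]
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    valid = norms > 0  # skip pairs where either vector has zero length
    cosines = np.sum(a[valid] * b[valid], axis=1) / norms[valid]
    return float(np.mean(cosines))
```

Scanning `lag` over a range of months and plotting the result would give a rough, coordinate-free view of which periods bring the trajectory back to the same direction of change.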

