dc.contributor.author | Sáez Silvestre, Carlos | es_ES |
dc.contributor.author | Gutiérrez-Sacristán, Alba | es_ES |
dc.contributor.author | Kohane, Isaac | es_ES |
dc.contributor.author | Garcia-Gomez, Juan M | es_ES |
dc.contributor.author | Avillach, Paul | es_ES |
dc.date.accessioned | 2021-05-28T03:33:51Z | |
dc.date.available | 2021-05-28T03:33:51Z | |
dc.date.issued | 2020-07-30 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/166908 | |
dc.description.abstract | [EN] Background: Temporal variability in health-care processes or protocols is intrinsic to medicine. Such variability can potentially introduce data-set shifts, a data quality issue when reusing electronic health records (EHRs) for secondary purposes. Temporal data-set shifts can present as trends, as well as abrupt or seasonal changes in the statistical distributions of data over time. The latter are particularly complicated to address in multimodal and highly coded data. These changes, if not delineated, can harm population and data-driven research, such as machine learning. Given that biomedical research repositories are increasingly being populated with large sets of historical data from EHRs, there is a need for specific software methods to help delineate temporal data-set shifts to ensure reliable data reuse. Results: EHRtemporalVariability is an open-source R package and Shiny app designed to explore and identify temporal data-set shifts. EHRtemporalVariability estimates the statistical distributions of coded and numerical data over time; projects their temporal evolution through non-parametric information geometric temporal plots; and enables the exploration of changes in variables through data temporal heat maps. We demonstrate the capability of EHRtemporalVariability to delineate data-set shifts in three impact case studies, one of which is available for reproducibility. Conclusions: EHRtemporalVariability enables the exploration and identification of data-set shifts, contributing to the broad examination and repurposing of large, longitudinal data sets. Our goal is to help ensure reliable data reuse for a wide range of biomedical data users. EHRtemporalVariability is designed both for technical users who work with the R package programmatically and for users unfamiliar with programming, who can use the Shiny user interface. | es_ES |
dc.description.sponsorship | This work was supported by Universitat Politècnica de València grant PAID-00-17, Generalitat Valenciana grant BEST/2018, and projects H2020-SC1-2016-CNECT No. 727560 and H2020-SC1-BHC-2018-2020 No. 825750. | es_ES |
dc.language | English | es_ES |
dc.publisher | Oxford University Press | es_ES |
dc.relation.ispartof | GigaScience | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject | Data-set shifts | es_ES |
dc.subject | Data quality | es_ES |
dc.subject | Temporal variability | es_ES |
dc.subject | Scientific data sets | es_ES |
dc.subject | Electronic health records | es_ES |
dc.subject | Claims data | es_ES |
dc.subject | Research repositories | es_ES |
dc.subject | Information geometry | es_ES |
dc.subject | Visual analytics | es_ES |
dc.subject | R package | es_ES |
dc.subject.classification | APPLIED PHYSICS | es_ES |
dc.title | EHRtemporalVariability: delineating temporal data-set shifts in electronic health records | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1093/gigascience/giaa079 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/727560/EU/Collective wisdom driving public health policies/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/UPV//PAID-00-17/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/EC/H2020/825750/EU/Patient-centred pathways of early palliative care, supportive ecosystems and appraisal standard/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Departamento de Física Aplicada - Departament de Física Aplicada | es_ES |
dc.description.bibliographicCitation | Sáez Silvestre, C.; Gutiérrez-Sacristán, A.; Kohane, I.; Garcia-Gomez, JM.; Avillach, P. (2020). EHRtemporalVariability: delineating temporal data-set shifts in electronic health records. GigaScience. 9(8):1-7. https://doi.org/10.1093/gigascience/giaa079 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1093/gigascience/giaa079 | es_ES |
dc.description.upvformatpinicio | 1 | es_ES |
dc.description.upvformatpfin | 7 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 9 | es_ES |
dc.description.issue | 8 | es_ES |
dc.identifier.eissn | 2047-217X | es_ES |
dc.identifier.pmid | 32729900 | es_ES |
dc.identifier.pmcid | PMC7391413 | es_ES |
dc.relation.pasarela | S\418366 | es_ES |
dc.contributor.funder | Generalitat Valenciana | es_ES |
dc.contributor.funder | European Commission | es_ES |
dc.contributor.funder | Universitat Politècnica de València | es_ES |
dc.description.references | Gewin, V. (2016). Data sharing: An open mind on open data. Nature, 529(7584), 117-119. doi:10.1038/nj7584-117a | es_ES |
dc.description.references | Katzan, I. L., & Rudick, R. A. (2012). Time to Integrate Clinical and Research Informatics. Science Translational Medicine, 4(162). doi:10.1126/scitranslmed.3004583 | es_ES |
dc.description.references | Zhu, L., & Zheng, W. J. (2018). Informatics, Data Science, and Artificial Intelligence. JAMA, 320(11), 1103. doi:10.1001/jama.2018.8211 | es_ES |
dc.description.references | Rajkomar, A., Dean, J., & Kohane, I. (2019). Machine Learning in Medicine. New England Journal of Medicine, 380(14), 1347-1358. doi:10.1056/nejmra1814259 | es_ES |
dc.description.references | Andreu-Perez, J., Poon, C. C. Y., Merrifield, R. D., Wong, S. T. C., & Yang, G.-Z. (2015). Big Data for Health. IEEE Journal of Biomedical and Health Informatics, 19(4), 1193-1208. doi:10.1109/jbhi.2015.2450362 | es_ES |
dc.description.references | Sáez, C., Rodrigues, P. P., Gama, J., Robles, M., & García-Gómez, J. M. (2014). Probabilistic change detection and visualization methods for the assessment of temporal stability in biomedical data quality. Data Mining and Knowledge Discovery, 29(4), 950-975. doi:10.1007/s10618-014-0378-6 | es_ES |
dc.description.references | Schlegel, D. R., & Ficheur, G. (2017). Secondary Use of Patient Data: Review of the Literature Published in 2016. Yearbook of Medical Informatics, 26(01), 68-71. doi:10.15265/iy-2017-032 | es_ES |
dc.description.references | Agniel, D., Kohane, I. S., & Weber, G. M. (2018). Biases in electronic health record data due to processes within the healthcare system: retrospective observational study. BMJ, k1479. doi:10.1136/bmj.k1479 | es_ES |
dc.description.references | Sáez, C., & García-Gómez, J. M. (2018). Kinematics of Big Biomedical Data to characterize temporal variability and seasonality of data repositories: Functional Data Analysis of data temporal evolution over non-parametric statistical manifolds. International Journal of Medical Informatics, 119, 109-124. doi:10.1016/j.ijmedinf.2018.09.015 | es_ES |
dc.description.references | Leek, J. T., Scharpf, R. B., Bravo, H. C., Simcha, D., Langmead, B., Johnson, W. E., … Irizarry, R. A. (2010). Tackling the widespread and critical impact of batch effects in high-throughput data. Nature Reviews Genetics, 11(10), 733-739. doi:10.1038/nrg2825 | es_ES |
dc.description.references | Goh, W. W. B., Wang, W., & Wong, L. (2017). Why Batch Effects Matter in Omics Data, and How to Avoid Them. Trends in Biotechnology, 35(6), 498-507. doi:10.1016/j.tibtech.2017.02.012 | es_ES |
dc.description.references | Sáez, C., Zurriaga, O., Pérez-Panadés, J., Melchor, I., Robles, M., & García-Gómez, J. M. (2016). Applying probabilistic temporal and multisite data quality control methods to a public health mortality registry in Spain: a systematic approach to quality control of repositories. Journal of the American Medical Informatics Association, 23(6), 1085-1095. doi:10.1093/jamia/ocw010 | es_ES |
dc.description.references | Wright, A., Ash, J. S., Aaron, S., Ai, A., Hickman, T.-T. T., Wiesen, J. F., … Sittig, D. F. (2018). Best practices for preventing malfunctions in rule-based clinical decision support alerts and reminders: Results of a Delphi study. International Journal of Medical Informatics, 118, 78-85. doi:10.1016/j.ijmedinf.2018.08.001 | es_ES |
dc.description.references | Moreno-Torres, J. G., Raeder, T., Alaiz-Rodríguez, R., Chawla, N. V., & Herrera, F. (2012). A unifying view on dataset shift in classification. Pattern Recognition, 45(1), 521-530. doi:10.1016/j.patcog.2011.06.019 | es_ES |
dc.description.references | Svolba, G., & Bauer, P. (1999). Statistical Quality Control in Clinical Trials. Controlled Clinical Trials, 20(6), 519-530. doi:10.1016/s0197-2456(99)00029-x | es_ES |
dc.description.references | Bray, F., & Parkin, D. M. (2009). Evaluation of data quality in the cancer registry: Principles and methods. Part I: Comparability, validity and timeliness. European Journal of Cancer, 45(5), 747-755. doi:10.1016/j.ejca.2008.11.032 | es_ES |
dc.description.references | Springate, D. A., Parisi, R., Olier, I., Reeves, D., & Kontopantelis, E. (2017). rEHR: An R package for manipulating and analysing Electronic Health Record data. PLOS ONE, 12(2), e0171784. doi:10.1371/journal.pone.0171784 | es_ES |
dc.description.references | Choi, L., Carroll, R. J., Beck, C., Mosley, J. D., Roden, D. M., Denny, J. C., & Van Driest, S. L. (2018). Evaluating statistical approaches to leverage large clinical datasets for uncovering therapeutic and adverse medication effects. Bioinformatics, 34(17), 2988-2996. doi:10.1093/bioinformatics/bty306 | es_ES |
dc.description.references | Gutiérrez-Sacristán, A., Bravo, À., Giannoula, A., Mayer, M. A., Sanz, F., & Furlong, L. I. (2018). comoRbidity: an R package for the systematic analysis of disease comorbidities. Bioinformatics, 34(18), 3228-3230. doi:10.1093/bioinformatics/bty315 | es_ES |
dc.description.references | Denny, J. C., Bastarache, L., Ritchie, M. D., Carroll, R. J., Zink, R., Mosley, J. D., … Roden, D. M. (2013). Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nature Biotechnology, 31(12), 1102-1111. doi:10.1038/nbt.2749 | es_ES |
dc.description.references | Khera, R., Dorsey, K. B., & Krumholz, H. M. (2018). Transition to the ICD-10 in the United States. JAMA, 320(2), 133. doi:10.1001/jama.2018.6823 | es_ES |