Multisource and temporal variability in Portuguese hospital administrative datasets: Data quality implications

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia



dc.contributor.author Souza, Júlio es_ES
dc.contributor.author Caballero, Ismael es_ES
dc.contributor.author Vasco Santos, Joao es_ES
dc.contributor.author Lobo, Mariana es_ES
dc.contributor.author Pinto, Andreia es_ES
dc.contributor.author Viana, Joao es_ES
dc.contributor.author Sáez Silvestre, Carlos es_ES
dc.contributor.author Lopes, Fernando es_ES
dc.contributor.author Freitas, Alberto es_ES
dc.date.accessioned 2023-06-21T18:02:11Z
dc.date.available 2023-06-21T18:02:11Z
dc.date.issued 2022-12 es_ES
dc.identifier.issn 1532-0464 es_ES
dc.identifier.uri http://hdl.handle.net/10251/194472
dc.description.abstract [EN] Background: Unexpected variability across healthcare datasets may indicate data quality issues and thereby affect the credibility of these data for reuse. No gold-standard reference dataset or methods for variability assessment are usually available for these datasets. In this study, we aim to describe the process of discovering data quality implications by applying a set of methods for assessing variability between sources and over time in a large hospital database. Methods: We described and applied a set of multisource and temporal variability assessment methods in a large Portuguese hospitalization database, in which variation in condition-specific hospitalization ratios derived from clinically coded data was assessed between hospitals (sources) and over time. We identified condition-specific admissions using the Clinical Classifications Software (CCS), developed by the Agency for Healthcare Research and Quality. A Statistical Process Control (SPC) approach based on funnel plots of condition-specific standardized hospitalization ratios (SHR) was used to assess multisource variability, whereas temporal heat maps and Information-Geometric Temporal (IGT) plots were used to assess temporal variability by displaying abrupt temporal changes in data distributions. Results were presented for the 15 most common inpatient conditions (CCS) in Portugal. Main findings: Funnel plot assessment allowed the detection of several outlying hospitals whose SHRs were much lower or higher than expected. Adjusting SHR for hospital characteristics, beyond age and sex, considerably affected the degree of multisource variability for most diseases. Overall, probability distributions changed over time for most diseases, although heterogeneously. Abrupt temporal changes in data distributions for acute myocardial infarction and congestive heart failure coincided with the periods comprising the transition to the International Classification of Diseases, 10th revision, Clinical Modification, whereas changes in the Diagnosis-Related Groups software seem to have driven changes in data distributions for both acute myocardial infarction and liveborn admissions. The analysis of heat maps also allowed the detection of several discontinuities at hospital level over time, in some cases also coinciding with the aforementioned factors. Conclusions: This paper described the successful application of a set of reproducible, generalizable and systematic methods for variability assessment, including visualization tools that can be useful for detecting abnormal patterns in healthcare data, also addressing some limitations of common approaches. The presented method for multisource variability assessment is based on SPC, which is an advantage considering the lack of a gold standard for such a process. Properly controlling for hospital characteristics and differences in case-mix when estimating SHR is critical for isolating data quality-related variability among data sources. The use of IGT plots provides an advantage over common methods for temporal variability assessment due to its suitability for multitype and multimodal data, which are common characteristics of healthcare data. The novelty of this work is the use of a set of methods to discover new data quality insights in healthcare data. es_ES
dc.description.sponsorship The authors would like to thank the Central Authority for Health Services, I.P. (ACSS) for providing access to the data. The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was financed by FEDER - Fundo Europeu de Desenvolvimento Regional funds through the COMPETE 2020 - Operational Programme for Competitiveness and Internationalisation (POCI) and by Portuguese funds through FCT - Fundação para a Ciência e a Tecnologia in the framework of the project POCI-01-0145-FEDER-030766 ("1st.IndiQare - Quality indicators in primary health care: validation and implementation of quality indicators as an assessment and comparison tool"). In addition, we would like to thank the projects GEMA (SBPLY/17/180501/000293) - Generation and Evaluation of Models for Data Quality, and ADAGIO (SBPLY/21/180501/000061) - Alarcos Data Governance framework and systems generation, both funded by the Department of Education, Culture and Sports of the JCCM and FEDER; and the project AETHER-UCLM: A smart data holistic approach for context-aware data analytics focused on Quality and Security (Ministerio de Ciencia e Innovación, PID2020-112540RB-C42). CSS thanks the Universitat Politècnica de València contract no. UPV-SUB.2-1302 and FONDO SUPERA COVID-19 by CRUE-Santander Bank grant "Severity Subgroup Discovery and Classification on COVID-19 Real World Data through Machine Learning and Data Quality assessment (SUBCOVERWD-19)". es_ES
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Journal of Biomedical Informatics es_ES
dc.rights Attribution - NonCommercial - NoDerivatives (by-nc-nd) es_ES
dc.subject Data quality es_ES
dc.subject Clinical coding es_ES
dc.subject Data variability es_ES
dc.subject Clinical classification software es_ES
dc.subject International classification of diseases es_ES
dc.subject.classification APPLIED PHYSICS es_ES
dc.title Multisource and temporal variability in Portuguese hospital administrative datasets: Data quality implications es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.jbi.2022.104242 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2020-112540RB-C42/ES/UNA APROXIMACION HOLISTICA DE SMART DATA PARA EL ANALISIS DE DATOS GUIADO POR EL CONTEXTO CENTRADA EN LA CALIDAD Y LA SEGURIDAD / es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UPV//UPV-SUB.2-1302/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/FEDER//POCI-01-0145-FEDER-030766//1st.IndiQare-Quality indicators in primary health care: validation and implementation of quality indicators as an assessment and comparison tool/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/JCCM//SBPLY%2F17%2F180501%2F000293//GEMA-Generation and Evaluation of Models for Data Quality/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/JCCM//SBPLY%2F21%2F180501%2F000061//ADAGIO Alarcos Data Governance framework and systems generation/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escuela Técnica Superior de Ingenieros Industriales - Escola Tècnica Superior d'Enginyers Industrials es_ES
dc.description.bibliographicCitation Souza, J.; Caballero, I.; Vasco Santos, J.; Lobo, M.; Pinto, A.; Viana, J.; Sáez Silvestre, C.... (2022). Multisource and temporal variability in Portuguese hospital administrative datasets: Data quality implications. Journal of Biomedical Informatics. 136:1-11. https://doi.org/10.1016/j.jbi.2022.104242 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1016/j.jbi.2022.104242 es_ES
dc.description.upvformatpinicio 1 es_ES
dc.description.upvformatpfin 11 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 136 es_ES
dc.identifier.pmid 36372346 es_ES
dc.relation.pasarela S\483292 es_ES
dc.contributor.funder Agencia Estatal de Investigación es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder Universitat Politècnica de València es_ES
dc.contributor.funder Junta de Comunidades de Castilla-La Mancha es_ES
dc.contributor.funder Fundação para a Ciência e a Tecnologia, Portugal es_ES

