
Probabilistic change detection and visualization methods for the assessment of temporal stability in biomedical data quality

RiuNet: Institutional repository of the Polytechnic University of Valencia


Sáez Silvestre, C.; Pereira Rodrigues, P.; Gama, J.; Robles Viejo, M.; García Gómez, JM. (2014). Probabilistic change detection and visualization methods for the assessment of temporal stability in biomedical data quality. Data Mining and Knowledge Discovery. 28:1-1. doi:10.1007/s10618-014-0378-6

Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/50768


Item Metadata

Title: Probabilistic change detection and visualization methods for the assessment of temporal stability in biomedical data quality
Author: Sáez Silvestre, Carlos; Pereira Rodrigues, Pedro; Gama, João; Robles Viejo, Montserrat; García Gómez, Juan Miguel
UPV Unit: Universitat Politècnica de València. Instituto Universitario de Aplicaciones de las Tecnologías de la Información - Institut Universitari d'Aplicacions de les Tecnologies de la Informació
Universitat Politècnica de València. Departamento de Física Aplicada - Departament de Física Aplicada
Issued date:
Abstract:
Knowledge discovery on biomedical data can be based on on-line, data-stream analyses, or using retrospective, timestamped, off-line datasets. In both cases, changes in the processes that generate data or in their quality ...
Subjects: Data quality , Change detection , Information theory , Information geometry , Visual analytics , Biomedical data
Copyright: All rights reserved
Source: Data Mining and Knowledge Discovery (ISSN: 1384-5810)
DOI: 10.1007/s10618-014-0378-6
Publisher: Springer Verlag (Germany)
Publisher version: http://link.springer.com/article/10.1007/s10618-014-0378-6
Description: The final publication is available at Springer via http://dx.doi.org/10.1007/s10618-014-0378-6. Published online.
Thanks:
The work by C Saez has been supported by an Erasmus Lifelong Learning Programme 2013 Grant. This work has been supported by own IBIME funds. The authors thank Dr. Gregor Stiglic, from the University of Maribor, Slovenia, ...
Type: Article

References

Aggarwal C (2003) A framework for diagnosing changes in evolving data streams. In: Proceedings of the International Conference on Management of Data ACM SIGMOD, pp 575–586

Amari SI, Nagaoka H (2007) Methods of information geometry. American Mathematical Society, Providence, RI

Arias E (2014) United States life tables, 2009. Natl Vital Statist Rep 62(7): 1–63

Aspden P, Corrigan JM, Wolcott J, Erickson SM (2004) Patient safety: achieving a new standard for care. Committee on data standards for patient safety. The National Academies Press, Washington, DC

Basseville M, Nikiforov IV (1993) Detection of abrupt changes: theory and application. Prentice-Hall Inc, Upper Saddle River, NJ

Borg I, Groenen PJF (2010) Modern multidimensional scaling: theory and applications. Springer, Berlin

Bowman AW, Azzalini A (1997) Applied smoothing techniques for data analysis: the Kernel approach with S-plus illustrations (Oxford statistical science series). Oxford University Press, Oxford

Brandes U, Pich C (2007) Eigensolver methods for progressive multidimensional scaling of large data. In: Kaufmann M, Wagner D (eds) Graph drawing. Lecture notes in computer science, vol 4372. Springer, Berlin, pp 42–53

Brockwell P, Davis R (2009) Time series: theory and methods. Springer series in statistics. Springer, Berlin

Cesario SK (2002) The “Christmas Effect” and other biometeorologic influences on childbearing and the health of women. J Obstet Gynecol Neonatal Nurs 31(5):526–535

Chakrabarti K, Garofalakis M, Rastogi R, Shim K (2001) Approximate query processing using wavelets. VLDB J 10(2–3):199–223

Cruz-Correia RJ, Pereira Rodrigues P, Freitas A, Canario Almeida F, Chen R, Costa-Pereira A (2010) Data quality and integration issues in electronic health records. Information discovery on electronic health records, pp 55–96

Csiszár I (1967) Information-type measures of difference of probability distributions and indirect observations. Studia Sci Math Hungar 2:299–318

Dasu T, Krishnan S, Lin D, Venkatasubramanian S, Yi K (2009) Change (detection) you can believe in: finding distributional shifts in data streams. In: Proceedings of the 8th international symposium on intelligent data analysis: advances in intelligent data analysis VIII, IDA ’09. Springer, Berlin, pp 21–34

Endres D, Schindelin J (2003) A new metric for probability distributions. IEEE Trans Inform Theory 49(7):1858–1860

Gama J, Gaber MM (2007) Learning from data streams: processing techniques in sensor networks. Springer, Berlin

Gama J, Medas P, Castillo G, Rodrigues P (2004) Learning with drift detection. In: Bazzan A, Labidi S (eds) Advances in artificial intelligence—SBIA 2004. Lecture notes in computer science. Springer, Berlin, pp 286–295

Gama J (2010) Knowledge discovery from data streams, 1st edn. Chapman & Hall, London

Gehrke J, Korn F, Srivastava D (2001) On computing correlated aggregates over continual data streams. SIGMOD Rec 30(2):13–24

Guha S, Shim K, Woo J (2004) Rehist: relative error histogram construction algorithms. In: Proceedings of the thirtieth international conference on very large data bases VLDB, pp 300–311

Han J, Kamber M, Pei J (2012) Data mining: concepts and techniques. Morgan Kaufmann, Elsevier, Burlington, MA

Howden LM, Meyer JA (2011) Age and sex composition. 2010 Census Briefs US Department of Commerce. Economics and Statistics Administration, US Census Bureau

Hrovat G, Stiglic G, Kokol P, Ojstersek M (2014) Contrasting temporal trend discovery for large healthcare databases. Comput Methods Program Biomed 113(1):251–257

Keim DA (2000) Designing pixel-oriented visualization techniques: theory and applications. IEEE Trans Vis Comput Graph 6(1):59–78

Kifer D, Ben-David S, Gehrke J (2004) Detecting change in data streams. In: Proceedings of the thirtieth international conference on Very large data bases, VLDB Endowment, VLDB ’04, vol 30, pp 180–191

Klinkenberg R, Renz I (1998) Adaptive information filtering: Learning in the presence of concept drifts. In: Workshop notes of the ICML/AAAI-98 workshop learning for text categorization. AAAI Press, Menlo Park, pp 33–40

Kohonen T (1982) Self-organized formation of topologically correct feature maps. Biolog Cybern 43(1):59–69

Lin J (1991) Divergence measures based on the Shannon entropy. IEEE Trans Inform Theory 37:145–151

Mitchell TM, Caruana R, Freitag D, McDermott J, Zabowski D (1994) Experience with a learning personal assistant. Commun ACM 37(7):80–91

Mouss H, Mouss D, Mouss N, Sefouhi L (2004) Test of Page-Hinckley, an approach for fault detection in an agro-alimentary production system. In: Proceedings of the 5th Asian Control Conference, vol 2, pp 815–818

National Research Council (2011) Explaining different levels of longevity in high-income countries. The National Academies Press, Washington, DC

NHDS (2010) United States Department of Health and Human Services. Centers for Disease Control and Prevention. National Center for Health Statistics. National Hospital Discharge Survey codebook

NHDS (2014) National Center for Health Statistics, National Hospital Discharge Survey (NHDS) data, US Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics, Hyattsville, Maryland. http://www.cdc.gov/nchs/nhds.htm

Papadimitriou S, Sun J, Faloutsos C (2005) Streaming pattern discovery in multiple time-series. In: Proceedings of the 31st international conference on very large data bases, VLDB endowment, VLDB ’05, pp 697–708

Parzen E (1962) On estimation of a probability density function and mode. Ann Math Statist 33(3):1065–1076

Ramsay JO, Silverman BW (2005) Functional data analysis. Springer, New York

Rodrigues P, Correia R (2013) Streaming virtual patient records. In: Krempl G, Zliobaite I, Wang Y, Forman G (eds) Real-world challenges for data stream mining. University Magdeburg, Otto-von-Guericke, pp 34–37

Rodrigues P, Gama J, Pedroso J (2008) Hierarchical clustering of time-series data streams. IEEE Trans Knowl Data Eng 20(5):615–627

Rodrigues PP, Gama J (2010) A simple dense pixel visualization for mobile sensor data mining. In: Proceedings of the second international conference on knowledge discovery from sensor data, sensor-KDD’08. Springer, Berlin, pp 175–189

Rodrigues PP, Gama J, Sebastião R (2010) Memoryless fading windows in ubiquitous settings. In: Proceedings of ubiquitous data mining (UDM) workshop in conjunction with the 19th European conference on artificial intelligence—ECAI 2010, pp 27–32

Rodrigues PP, Sebastião R, Santos CC (2011) Improving cardiotocography monitoring: a memory-less stream learning approach. In: Proceedings of the learning from medical data streams workshop. Bled, Slovenia

Rubner Y, Tomasi C, Guibas L (2000) The earth mover’s distance as a metric for image retrieval. Int J Comput Vision 40(2):99–121

Sebastião R, Gama J (2009) A study on change detection methods. In: 4th Portuguese conference on artificial intelligence

Sebastião R, Gama J, Rodrigues P, Bernardes J (2010) Monitoring incremental histogram distribution for change detection in data streams. In: Gaber M, Vatsavai R, Omitaomu O, Gama J, Chawla N, Ganguly A (eds) Knowledge discovery from sensor data. Lecture notes in computer science, vol 5840. Springer, Berlin, pp 25–42

Sebastião R, Silva M, Rabiço R, Gama J, Mendonça T (2013) Real-time algorithm for changes detection in depth of anesthesia signals. Evol Syst 4(1):3–12

Sáez C, Martínez-Miranda J, Robles M, García-Gómez JM (2012) Organizing data quality assessment of shifting biomedical data. Stud Health Technol Inform 180:721–725

Sáez C, Robles M, García-Gómez JM (2013) Comparative study of probability distribution distances to define a metric for the stability of multi-source biomedical research data. In: Engineering in medicine and biology society (EMBC), 2013 35th annual international conference of the IEEE, pp 3226–3229

Sáez C, Robles M, García-Gómez JM (2014) Stability metrics for multi-source biomedical data based on simplicial projections from probability distribution distances. Statist Method Med Res (forthcoming)

Shewhart WA, Deming WE (1939) Statistical method from the viewpoint of quality control. Graduate School of the Department of Agriculture, Washington, DC

Shimazaki H, Shinomoto S (2010) Kernel bandwidth optimization in spike rate estimation. J Comput Neurosci 29(1–2):171–182

Solberg LI, Engebretson KI, Sperl-Hillen JM, Hroscikoski MC, O’Connor PJ (2006) Are claims data accurate enough to identify patients for performance measures or quality improvement? the case of diabetes, heart disease, and depression. Am J Med Qual 21(4):238–245

Spiliopoulou M, Ntoutsi I, Theodoridis Y, Schult R (2006) MONIC: modeling and monitoring cluster transitions. In: Proceedings of the 12th ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’06. ACM, New York, NY, pp 706–711

Stiglic G, Kokol P (2011) Interpretability of sudden concept drift in medical informatics domain. In: Proceedings of the 2010 IEEE international conference on data mining workshops, pp 609–613

Torgerson W (1952) Multidimensional scaling: I theory and method. Psychometrika 17(4):401–419

Wang RY, Strong DM (1996) Beyond accuracy: what data quality means to data consumers. J Manage Inform Syst 12(4):5–33

Weiskopf NG, Weng C (2013) Methods and dimensions of electronic health record data quality assessment: enabling reuse for clinical research. J Am Med Inform Assoc 20(1):144–151

Wellings K, Macdowall W, Catchpole M, Goodrich J (1999) Seasonal variations in sexual activity and their implications for sexual health promotion. J R Soc Med 92(2):60–64

Westgard JO, Barry PL (2010) Basic QC practices: training in statistical quality control for medical laboratories. Westgard Quality Corporation, Madison, WI

Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Mach Learn 23(1):69–101
