Enabling network inference methods to handle missing data and outliers

RiuNet: Institutional repository of the Polytechnic University of Valencia


Folch-Fortuny, A.; Fernández Villaverde, A.; Ferrer Riquelme, AJ.; Rodríguez Banga, J. (2015). Enabling network inference methods to handle missing data and outliers. BMC Bioinformatics. 16(283):1-12. https://doi.org/10.1186/s12859-015-0717-7

Please use this identifier to cite or link to this item: http://hdl.handle.net/10251/64905

Item Metadata

Title: Enabling network inference methods to handle missing data and outliers
Author: Folch-Fortuny, Abel; Fernández Villaverde, Alejandro; Ferrer Riquelme, Alberto José; Rodríguez Banga, Julio
UPV Unit: Universitat Politècnica de València. Departamento de Estadística e Investigación Operativa Aplicadas y Calidad - Departament d'Estadística i Investigació Operativa Aplicades i Qualitat
Issued date:
Abstract: Background: The inference of complex networks from data is a challenging problem in biological sciences, as well as in a wide range of disciplines such as chemistry, technology, economics, or sociology. The quantity ...
Subjects: Network inference, Missing data, Outlier detection, Projection to latent structures, Trimmed scores regression, Information theory, Mutual information
Copyrights: Attribution (by)
Source: BMC Bioinformatics (ISSN: 1471-2105)
DOI: 10.1186/s12859-015-0717-7
Publisher: BioMed Central
Publisher version: https://dx.doi.org/10.1186/s12859-015-0717-7
Project ID: Xunta de Galicia/I2C ED481B 2014/133-0
Description: © 2015 Folch-Fortuny et al. Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Research in this study was partially supported by the European Union through project BioPreDyn (FP7-KBBE 289434), and the Spanish Ministry of Science and Innovation and FEDER funds from the European Union through grants ...
Type: Article


