PLS model building with missing data: New algorithms and a comparative study

RiuNet: Institutional Repository of the Universitat Politècnica de València

dc.contributor.author Folch-Fortuny, Abel es_ES
dc.contributor.author Arteaga, Francisco es_ES
dc.contributor.author Ferrer, Alberto es_ES
dc.date.accessioned 2020-10-29T04:32:18Z
dc.date.available 2020-10-29T04:32:18Z
dc.date.issued 2017-07 es_ES
dc.identifier.issn 0886-9383 es_ES
dc.identifier.uri http://hdl.handle.net/10251/153468
dc.description.abstract [EN] New algorithms to deal with missing values in predictive modelling are presented in this article. Specifically, 2 trimmed scores regression adaptations are proposed, one from principal component analysis model building with missing data (MD) and the other from partial least squares regression model exploitation with missing values. Using these methods, practitioners can impute MD both in the explanatory/predictor and the dependent/response variables. Partial least squares is used here to build the multivariate calibration models; however, any regression method can be used after MD imputation. Four case studies, with different latent structures, are analysed here to compare the trimmed scores regression-based methods against state-of-the-art approaches. The MATLAB code for these methods is also provided for its direct implementation at http://mseg.webs.upv.es, under a GNU license. es_ES
dc.description.sponsorship Spanish Ministry of Science and Innovation; FEDER; European Union, Grant/Award Number: DPI2011-28112-C04-02 and DPI2014-55276-C5-1-R; Spanish Ministry of Economy and Competitiveness, Grant/Award Number: ECO2013-43353-R es_ES
dc.language English es_ES
dc.publisher John Wiley & Sons es_ES
dc.relation.ispartof Journal of Chemometrics es_ES
dc.rights All rights reserved es_ES
dc.subject Imputation es_ES
dc.subject Missing data es_ES
dc.subject Multivariate calibration es_ES
dc.subject Partial least squares regression (PLS) es_ES
dc.subject Trimmed scores regression (TSR) es_ES
dc.subject.classification STATISTICS AND OPERATIONS RESEARCH es_ES
dc.title PLS model building with missing data: New algorithms and a comparative study es_ES
dc.type Article es_ES
dc.identifier.doi 10.1002/cem.2897 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//ECO2013-43353-R/ES/CREAR CAPITAL DE MARCA E INNOVAR A TRAVES DE LA RELACION: OPORTUNIDADES PARA LA EMPRESA TURISTICA MEDIANTE LOS AVANCES EN LAS TIC/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MICINN//DPI2011-28112-C04-02/ES/MONITORIZACION, INFERENCIA, OPTIMIZACION Y CONTROL MULTI-ESCALA: DE CELULAS A BIORREACTORES. (MULTISCALES)/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//DPI2014-55276-C5-1-R/ES/BIOLOGIA SINTETICA PARA LA MEJORA EN BIOPRODUCCION: DISEÑO, OPTIMIZACION, MONITORIZACION Y CONTROL/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Estadística e Investigación Operativa Aplicadas y Calidad - Departament d'Estadística i Investigació Operativa Aplicades i Qualitat es_ES
dc.description.bibliographicCitation Folch-Fortuny, A.; Arteaga, F.; Ferrer, A. (2017). PLS model building with missing data: New algorithms and a comparative study. Journal of Chemometrics. 31(7):1-12. https://doi.org/10.1002/cem.2897 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1002/cem.2897 es_ES
dc.description.upvformatpinicio 1 es_ES
dc.description.upvformatpfin 12 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 31 es_ES
dc.description.issue 7 es_ES
dc.relation.pasarela S\349804 es_ES
dc.contributor.funder Ministerio de Economía y Competitividad es_ES
dc.contributor.funder Ministerio de Ciencia e Innovación es_ES
dc.description.references Grung, B., & Manne, R. (1998). Missing values in principal component analysis. Chemometrics and Intelligent Laboratory Systems, 42(1-2), 125-139. doi:10.1016/s0169-7439(98)00031-8 es_ES
dc.description.references Arteaga, F., & Ferrer-Riquelme, A. J. (2009). Missing Data. Comprehensive Chemometrics, 285-314. doi:10.1016/b978-044452701-1.00125-3 es_ES
dc.description.references Folch-Fortuny, A., Arteaga, F., & Ferrer, A. (2015). PCA model building with missing data: New proposals and a comparative study. Chemometrics and Intelligent Laboratory Systems, 146, 77-88. doi:10.1016/j.chemolab.2015.05.006 es_ES
dc.description.references Arteaga, F., & Ferrer, A. (2002). Dealing with missing data in MSPC: several methods, different interpretations, some examples. Journal of Chemometrics, 16(8-10), 408-418. doi:10.1002/cem.750 es_ES
dc.description.references Arteaga, F., & Ferrer, A. (2005). Framework for regression-based missing data imputation methods in on-line MSPC. Journal of Chemometrics, 19(8), 439-447. doi:10.1002/cem.946 es_ES
dc.description.references Nelson, P. R. C., Taylor, P. A., & MacGregor, J. F. (1996). Missing data methods in PCA and PLS: Score calculations with incomplete observations. Chemometrics and Intelligent Laboratory Systems, 35(1), 45-65. doi:10.1016/s0169-7439(96)00007-x es_ES
dc.description.references Walczak, B., & Massart, D. L. (2001). Dealing with missing data. Chemometrics and Intelligent Laboratory Systems, 58(1), 15-27. doi:10.1016/s0169-7439(01)00131-9 es_ES
dc.description.references Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. doi:10.1201/9781439821862 es_ES
dc.description.references Folch-Fortuny, A., Arteaga, F., & Ferrer, A. (2016). Missing Data Imputation Toolbox for MATLAB. Chemometrics and Intelligent Laboratory Systems, 154, 93-100. doi:10.1016/j.chemolab.2016.03.019 es_ES
dc.description.references ProSensus Multivariate release 16.02 2016 es_ES
dc.description.references SIMCA release 14 2015 es_ES
dc.description.references The Unscrambler X Release 10.4 2016 es_ES
dc.description.references PLS_Toolbox Release 8.1 2016 es_ES
dc.description.references Liu, Y., & Brown, S. D. (2013). Comparison of five iterative imputation methods for multivariate classification. Chemometrics and Intelligent Laboratory Systems, 120, 106-115. doi:10.1016/j.chemolab.2012.11.010 es_ES
dc.description.references White, I. R., Royston, P., & Wood, A. M. (2010). Multiple imputation using chained equations: Issues and guidance for practice. Statistics in Medicine, 30(4), 377-399. doi:10.1002/sim.4067 es_ES
dc.description.references Schneider, T. (2001). Analysis of Incomplete Climate Data: Estimation of Mean Values and Covariance Matrices and Imputation of Missing Values. Journal of Climate, 14(5), 853-871. doi:10.1175/1520-0442(2001)014<0853:aoicde>2.0.co;2 es_ES
dc.description.references Fierro, R. D., Golub, G. H., Hansen, P. C., & O’Leary, D. P. (1997). Regularization by Truncated Total Least Squares. SIAM Journal on Scientific Computing, 18(4), 1223-1241. doi:10.1137/s1064827594263837 es_ES
dc.description.references Puwakkatiya-Kankanamage, E. H., García-Muñoz, S., & Biegler, L. T. (2014). An optimization-based undeflated PLS (OUPLS) method to handle missing data in the training set. Journal of Chemometrics, 28(7), 575-584. doi:10.1002/cem.2618 es_ES
dc.description.references Camacho, J., Picó, J., & Ferrer, A. (2008). Bilinear modelling of batch processes. Part II: a comparison of PLS soft-sensors. Journal of Chemometrics, 22(10), 533-547. doi:10.1002/cem.1179 es_ES
dc.description.references Geladi, P., & Kowalski, B. R. (1986). Partial least-squares regression: a tutorial. Analytica Chimica Acta, 185, 1-17. doi:10.1016/0003-2670(86)80028-9 es_ES
dc.description.references Kubinyi, H. (1996). Evolutionary variable selection in regression and PLS analyses. Journal of Chemometrics, 10(2), 119-133. doi:10.1002/(sici)1099-128x(199603)10:2<119::aid-cem409>3.0.co;2-4 es_ES
dc.description.references González-Martínez, J. M., Folch-Fortuny, A., Llaneras, F., Tortajada, M., Picó, J., & Ferrer, A. (2014). Metabolic flux understanding of Pichia pastoris grown on heterogenous culture media. Chemometrics and Intelligent Laboratory Systems, 134, 89-99. doi:10.1016/j.chemolab.2014.02.003 es_ES
dc.description.references Folch-Fortuny, A., Vitale, R., de Noord, O. E., & Ferrer, A. (2017). Calibration transfer between NIR spectrometers: New proposals and a comparative study. Journal of Chemometrics, 31(3), e2874. doi:10.1002/cem.2874 es_ES
dc.description.references Arteaga, F., & Ferrer, A. (2010). How to simulate normal data sets with the desired correlation structure. Chemometrics and Intelligent Laboratory Systems, 101(1), 38-42. doi:10.1016/j.chemolab.2009.12.003 es_ES
dc.description.references Arteaga, F., & Ferrer, A. (2013). Building covariance matrices with the desired structure. Chemometrics and Intelligent Laboratory Systems, 127, 80-88. doi:10.1016/j.chemolab.2013.06.003 es_ES
dc.description.references Folch-Fortuny, A., Arteaga, F., & Ferrer, A. (2016). Assessment of maximum likelihood PCA missing data imputation. Journal of Chemometrics, 30(7), 386-393. doi:10.1002/cem.2804 es_ES
dc.description.references Saccenti, E., & Camacho, J. (2015). On the use of the observation-wise k-fold operation in PCA cross-validation. Journal of Chemometrics, 29(8), 467-478. doi:10.1002/cem.2726 es_ES

