
Group linear algorithm with sparse principal decomposition: a variable selection and clustering method for generalized linear models

RiuNet: Repositorio Institucional de la Universidad Politécnica de Valencia


dc.contributor.author Laria, Juan C. es_ES
dc.contributor.author Aguilera-Morillo, M. Carmen es_ES
dc.contributor.author Lillo, Rosa E. es_ES
dc.date.accessioned 2023-10-06T18:00:58Z
dc.date.available 2023-10-06T18:00:58Z
dc.date.issued 2022-02 es_ES
dc.identifier.issn 0932-5026 es_ES
dc.identifier.uri http://hdl.handle.net/10251/197838
dc.description.abstract [EN] This paper introduces the Group Linear Algorithm with Sparse Principal decomposition, an algorithm for supervised variable selection and clustering. Our approach extends the Sparse Group Lasso regularization to calculate clusters as part of the model fit. Therefore, unlike Sparse Group Lasso, our idea does not require prior specification of clusters between variables. To determine the clusters, we solve a particular case of sparse Singular Value Decomposition, with a regularization term that follows naturally from the Group Lasso penalty. Moreover, this paper proposes a unified implementation to deal with, but not limited to, linear regression, logistic regression, and proportional hazards models with right-censoring. Our methodology is evaluated using both biological and simulated data, and details of the implementation in R and hyperparameter search are discussed. es_ES
dc.language English es_ES
dc.publisher Springer-Verlag es_ES
dc.relation.ispartof Statistical Papers es_ES
dc.rights All rights reserved es_ES
dc.subject Regression es_ES
dc.subject Classification es_ES
dc.subject Feature clustering es_ES
dc.subject Statistical computing es_ES
dc.subject.classification STATISTICS AND OPERATIONS RESEARCH es_ES
dc.title Group linear algorithm with sparse principal decomposition: a variable selection and clustering method for generalized linear models es_ES
dc.type Article es_ES
dc.identifier.doi 10.1007/s00362-022-01313-z es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-104901RB-I00/ES/NUEVAS ESTRATEGIAS EN REGRESION PENALIZADA CON APLICACIONES EN SALUD, DEMOGRAFIA Y ECONOMIA/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escuela Técnica Superior de Ingenieros Industriales - Escola Tècnica Superior d'Enginyers Industrials es_ES
dc.description.bibliographicCitation Laria, JC.; Aguilera-Morillo, MC.; Lillo, RE. (2022). Group linear algorithm with sparse principal decomposition: a variable selection and clustering method for generalized linear models. Statistical Papers. 64(1):227-253. https://doi.org/10.1007/s00362-022-01313-z es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.1007/s00362-022-01313-z es_ES
dc.description.upvformatpinicio 227 es_ES
dc.description.upvformatpfin 253 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 64 es_ES
dc.description.issue 1 es_ES
dc.relation.pasarela S\476636 es_ES
dc.contributor.funder Agencia Estatal de Investigación es_ES
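
The abstract above says the method extends the Sparse Group Lasso regularization. As a rough illustration only, the sketch below evaluates the standard Sparse Group Lasso penalty for a coefficient vector whose groups are given in advance. It is not the authors' implementation, and it does not perform the cluster estimation the paper proposes. The function name, the parameters lam and alpha, and the sqrt(group size) weighting are assumptions chosen for this example.

```python
import numpy as np


def sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5):
    """Illustrative Sparse Group Lasso penalty (hypothetical helper).

    Computes lam * (alpha * ||beta||_1
                    + (1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2),
    where p_g is the number of coefficients in group g.
    The groups are supplied by the caller here; estimating them
    as part of the fit is what the paper adds and is NOT shown.
    """
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)

    # Lasso part: encourages sparsity of individual coefficients.
    lasso_term = np.sum(np.abs(beta))

    # Group part: encourages whole groups to be zero together.
    group_term = 0.0
    for g in np.unique(groups):
        block = beta[groups == g]
        group_term += np.sqrt(block.size) * np.linalg.norm(block)

    return lam * (alpha * lasso_term + (1.0 - alpha) * group_term)


# Two groups of two coefficients each; the second group is entirely zero.
beta = [3.0, 4.0, 0.0, 0.0]
groups = [0, 0, 1, 1]
penalty = sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5)
```

With alpha = 1 this reduces to the lasso penalty and with alpha = 0 to the group lasso penalty, which is why a single mixing parameter is used in the sketch.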

