dc.contributor.author | Laria, Juan C. | es_ES |
dc.contributor.author | Aguilera-Morillo, M. Carmen | es_ES |
dc.contributor.author | Lillo, Rosa E. | es_ES |
dc.date.accessioned | 2023-10-06T18:00:58Z | |
dc.date.available | 2023-10-06T18:00:58Z | |
dc.date.issued | 2022-02 | es_ES |
dc.identifier.issn | 0932-5026 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/197838 | |
dc.description.abstract | [EN] This paper introduces the Group Linear Algorithm with Sparse Principal decomposition, an algorithm for supervised variable selection and clustering. Our approach extends the Sparse Group Lasso regularization to calculate clusters as part of the model fit. Therefore, unlike Sparse Group Lasso, our idea does not require prior specification of clusters between variables. To determine the clusters, we solve a particular case of sparse Singular Value Decomposition, with a regularization term that follows naturally from the Group Lasso penalty. Moreover, this paper proposes a unified implementation to deal with, but not limited to, linear regression, logistic regression, and proportional hazards models with right-censoring. Our methodology is evaluated using both biological and simulated data, and details of the implementation in R and hyperparameter search are discussed. | es_ES |
dc.language | English | es_ES |
dc.publisher | Springer-Verlag | es_ES |
dc.relation.ispartof | Statistical Papers | es_ES |
dc.rights | All rights reserved | es_ES |
dc.subject | Regression | es_ES |
dc.subject | Classification | es_ES |
dc.subject | Feature clustering | es_ES |
dc.subject | Statistical computing | es_ES |
dc.subject.classification | STATISTICS AND OPERATIONS RESEARCH | es_ES |
dc.title | Group linear algorithm with sparse principal decomposition: a variable selection and clustering method for generalized linear models | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.1007/s00362-022-01313-z | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-104901RB-I00/ES/NUEVAS ESTRATEGIAS EN REGRESION PENALIZADA CON APLICACIONES EN SALUD, DEMOGRAFIA Y ECONOMIA/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escuela Técnica Superior de Ingenieros Industriales - Escola Tècnica Superior d'Enginyers Industrials | es_ES |
dc.description.bibliographicCitation | Laria, JC.; Aguilera-Morillo, MC.; Lillo, RE. (2022). Group linear algorithm with sparse principal decomposition: a variable selection and clustering method for generalized linear models. Statistical Papers. 64(1):227-253. https://doi.org/10.1007/s00362-022-01313-z | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.1007/s00362-022-01313-z | es_ES |
dc.description.upvformatpinicio | 227 | es_ES |
dc.description.upvformatpfin | 253 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 64 | es_ES |
dc.description.issue | 1 | es_ES |
dc.relation.pasarela | S\476636 | es_ES |
dc.contributor.funder | Agencia Estatal de Investigación | es_ES |