Group linear algorithm with sparse principal decomposition: a variable selection and clustering method for generalized linear models

Laria, Juan C.; Aguilera-Morillo, M. Carmen; Lillo, Rosa E.

doi:10.1007/s00362-022-01313-z


Files in this item

Name: Group linear algorithm ...

Size: 716.6Kb

Format: PDF

Description: Author's version.

Name: LariaAguilera-Mor ...

Size: 663.7Kb

Format: PDF

Description: Publisher's version.

dc.contributor.author	Laria, Juan C.	es_ES
dc.contributor.author	Aguilera-Morillo, M. Carmen	es_ES
dc.contributor.author	Lillo, Rosa E.	es_ES
dc.date.accessioned	2023-10-06T18:00:58Z
dc.date.available	2023-10-06T18:00:58Z
dc.date.issued	2022-02	es_ES
dc.identifier.issn	0932-5026	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/197838
dc.description.abstract	[EN] This paper introduces the Group Linear Algorithm with Sparse Principal decomposition, an algorithm for supervised variable selection and clustering. Our approach extends the Sparse Group Lasso regularization to calculate clusters as part of the model fit. Therefore, unlike Sparse Group Lasso, our idea does not require prior specification of clusters between variables. To determine the clusters, we solve a particular case of sparse Singular Value Decomposition, with a regularization term that follows naturally from the Group Lasso penalty. Moreover, this paper proposes a unified implementation to deal with, but not limited to, linear regression, logistic regression, and proportional hazards models with right-censoring. Our methodology is evaluated using both biological and simulated data, and details of the implementation in R and hyperparameter search are discussed.	es_ES
dc.language	English	es_ES
dc.publisher	Springer-Verlag	es_ES
dc.relation.ispartof	Statistical Papers	es_ES
dc.rights	All rights reserved	es_ES
dc.subject	Regression	es_ES
dc.subject	Classification	es_ES
dc.subject	Feature clustering	es_ES
dc.subject	Statistical computing	es_ES
dc.subject.classification	STATISTICS AND OPERATIONS RESEARCH	es_ES
dc.title	Group linear algorithm with sparse principal decomposition: a variable selection and clustering method for generalized linear models	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.1007/s00362-022-01313-z	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-104901RB-I00/ES/NUEVAS ESTRATEGIAS EN REGRESION PENALIZADA CON APLICACIONES EN SALUD, DEMOGRAFIA Y ECONOMIA/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Escuela Técnica Superior de Ingenieros Industriales - Escola Tècnica Superior d'Enginyers Industrials	es_ES
dc.description.bibliographicCitation	Laria, JC.; Aguilera-Morillo, MC.; Lillo, RE. (2022). Group linear algorithm with sparse principal decomposition: a variable selection and clustering method for generalized linear models. Statistical Papers. 64(1):227-253. https://doi.org/10.1007/s00362-022-01313-z	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	https://doi.org/10.1007/s00362-022-01313-z	es_ES
dc.description.upvformatpinicio	227	es_ES
dc.description.upvformatpfin	253	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	64	es_ES
dc.description.issue	1	es_ES
dc.relation.pasarela	S\476636	es_ES
dc.contributor.funder	Agencia Estatal de Investigación	es_ES
dc.description.references	Alizadeh AA, Eisen MB, Davis RE, Ma C, Lossos IS, Rosenwald A, Boldrick JC, Sabet H, Tran T, Yu X et al (2000) Distinct types of diffuse large b-cell lymphoma identified by gene expression profiling. Nature 403(6769):503–511	es_ES
dc.description.references	Bair E, Hastie T, Paul D, Tibshirani R (2006) Prediction by supervised principal components. J Am Stat Assoc 101(473):119–137	es_ES
dc.description.references	Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imag Sci 2(1):183–202	es_ES
dc.description.references	Beisser D, Klau GW, Dandekar T, Müller T, Dittrich MT (2010) Bionet: an r-package for the functional analysis of biological networks. Bioinformatics 26(8):1129–1130	es_ES
dc.description.references	Bergstra J, Bengio Y (2012) Random search for hyper-parameter optimization. J Mach Learn Res 13(Feb):281–305	es_ES
dc.description.references	Bühlmann P, Rütimann P, van de Geer S, Zhang CH (2013) Correlated variables in regression: clustering and sparse estimation. J Stat Plan Inference 143(11):1835–1858	es_ES
dc.description.references	Chen K, Chen K, Müller HG, Wang JL (2011) Stringing high-dimensional data for functional analysis. J Am Stat Assoc 106(493):275–284	es_ES
dc.description.references	Ciuperca G (2020) Adaptive elastic-net selection in a quantile model with diverging number of variable groups. Statistics 54(5):1147–1170	es_ES
dc.description.references	Dittrich MT, Klau GW, Rosenwald A, Dandekar T, Müller T (2008) Identifying functional modules in protein-protein interaction networks: an integrated exact approach. Bioinformatics 24(13):i223–i231	es_ES
dc.description.references	Eddelbuettel D, François R (2011) Rcpp: seamless R and C++ integration. J Stat Softw 40(8):1–18	es_ES
dc.description.references	Friedman J, Hastie T, Tibshirani R (2010a) A note on the group lasso and a sparse group lasso. arXiv preprint arXiv:1001.0736	es_ES
dc.description.references	Friedman J, Hastie T, Tibshirani R (2010b) Regularization paths for generalized linear models via coordinate descent. J Stat Softw 33(1):1	es_ES
dc.description.references	Kuhn M (2020) tune: Tidy Tuning Tools. https://CRAN.R-project.org/package=tune, r package version 0.1.0	es_ES
dc.description.references	Kuhn M, Vaughan D (2020) parsnip: a Common API to Modeling and Analysis Functions. https://CRAN.R-project.org/package=parsnip, r package version 0.0.5	es_ES
dc.description.references	Laria JC, Carmen Aguilera-Morillo M, Lillo RE (2019) An iterative sparse-group lasso. J Comput Graph Stat 28(3):722–731	es_ES
dc.description.references	Luo S, Chen Z (2020) Feature selection by canonical correlation search in high-dimensional multiresponse models with complex group structures. J Am Stat Assoc 115(531):1227–1235	es_ES
dc.description.references	Moore DF (2016) Applied survival analysis using R. Springer, New York	es_ES
dc.description.references	Ndiaye E, Fercoq O, Gramfort A, Salmon J (2016) Gap safe screening rules for sparse-group lasso. In: Advances in Neural Information Processing Systems, pp 388–396	es_ES
dc.description.references	Price BS, Sherwood B (2017) A cluster elastic net for multivariate regression. J Mach Learn Res 18(1):8685–8723	es_ES
dc.description.references	Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66(336):846–850	es_ES
dc.description.references	Ren S, Kang EL, Lu JL (2020) Mcen: a method of simultaneous variable selection and clustering for high-dimensional multinomial regression. Stat Comput 30(2):291–304	es_ES
dc.description.references	Rosenwald A, Wright G, Chan WC, Connors JM, Campo E, Fisher RI, Gascoyne RD, Muller-Hermelink HK, Smeland EB, Giltnane JM et al (2002) The use of molecular profiling to predict survival after chemotherapy for diffuse large-b-cell lymphoma. N Engl J Med 346(25):1937–1947	es_ES
dc.description.references	Shen H, Huang JZ (2008) Sparse principal component analysis via regularized low rank matrix approximation. J Multivar Anal 99(6):1015–1034	es_ES
dc.description.references	Simon N, Friedman J, Hastie T, Tibshirani R (2013) A sparse-group lasso. J Comput Graph Stat 22(2):231–245	es_ES
dc.description.references	Snoek J, Larochelle H, Adams RP (2012) Practical bayesian optimization of machine learning algorithms. In: Advances in Neural Information Processing Systems, pp 2951–2959	es_ES
dc.description.references	Therneau TM (2015) A package for survival analysis in S. https://CRAN.R-project.org/package=survival, version 2.38	es_ES
dc.description.references	Therneau TM, Grambsch PM (2000) Modeling survival data: extending the cox model. Springer, New York	es_ES
dc.description.references	Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc 58(1):267–288	es_ES
dc.description.references	Tibshirani R, Bien J, Friedman J, Hastie T, Simon N, Taylor J, Tibshirani RJ (2012) Strong rules for discarding predictors in lasso-type problems. J R Stat Soc Ser B 74(2):245–266	es_ES
dc.description.references	Witten DM, Shojaie A, Zhang F (2014) The cluster elastic net for high-dimensional regression with unknown variable grouping. Technometrics 56(1):112–122	es_ES
dc.description.references	Zhang Y, Zhang N, Sun D, Toh KC (2020) An efficient hessian based algorithm for solving large-scale sparse group lasso problems. Math Program 179(1):223–263	es_ES
dc.description.references	Zhao H, Wu Q, Li G, Sun J (2019) Simultaneous estimation and variable selection for interval-censored data with broken adaptive ridge regression. J Am Stat Assoc 1–13	es_ES
dc.description.references	Zhou N, Zhu J (2010) Group variable selection via a hierarchical lasso and its oracle property. Stat Interface 3:557–574	es_ES
dc.description.references	Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Ser B 67(2):301–320	es_ES


RiuNet: Institutional Repository of the Universidad Politécnica de Valencia
