
Geometrical codification for clustering mixed categorical and numerical databases

RiuNet: Institutional Repository of the Universitat Politècnica de València


dc.contributor.author Barceló Rico, Fátima es_ES
dc.contributor.author Diez, José-Luís es_ES
dc.date.accessioned 2017-12-28T07:57:42Z
dc.date.available 2017-12-28T07:57:42Z
dc.date.issued 2012 es_ES
dc.identifier.issn 0925-9902 es_ES
dc.identifier.uri http://hdl.handle.net/10251/93511
dc.description.abstract [EN] This paper presents an alternative approach to clustering mixed databases. The aim is a general method for clustering mixed data sets that is not very complex yet reaches performance levels similar to those of some good algorithms. The proposed approach codifies the categorical attributes and then applies a numerical clustering algorithm to the resulting database. The proposed codification is based on polar or spherical coordinates; it is easy to understand and apply, it does not increase the length of the input matrix excessively, and its codification error can be determined for each case. Combined with the well-known k-means algorithm, the proposed codification performed very well on different benchmarks. It was compared with both other codifications and other mixed clustering algorithms and showed better or comparable performance in all cases. es_ES
dc.description.sponsorship The authors acknowledge the partial funding of this work by the National projects DPI2007-66728-C02-01 and DPI2008-06737-C02-01. en_EN
dc.language English es_ES
dc.publisher SPRINGER es_ES
dc.relation.ispartof Journal of Intelligent Information Systems es_ES
dc.rights All rights reserved es_ES
dc.subject Clustering es_ES
dc.subject Codification error es_ES
dc.subject Data conversion es_ES
dc.subject k-means es_ES
dc.subject Mixed data es_ES
dc.subject Categorical attributes es_ES
dc.subject General method es_ES
dc.subject Input matrices es_ES
dc.subject k-Means algorithm es_ES
dc.subject Mixed database es_ES
dc.subject Spherical coordinates es_ES
dc.subject Benchmarking es_ES
dc.subject Database systems es_ES
dc.subject Clustering algorithms es_ES
dc.subject.classification SYSTEMS ENGINEERING AND AUTOMATION es_ES
dc.title Geometrical codification for clustering mixed categorical and numerical databases es_ES
dc.type Article es_ES
dc.identifier.doi 10.1007/s10844-011-0187-y es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MEC//DPI2007-66728-C02-01/ES/CONTROL DE GLUCEMIA EN LAZO CERRADO EN PACIENTES CON DIABETES MELLITUS 1 Y PACIENTES CRITICOS/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Ingeniería de Sistemas y Automática - Departament d'Enginyeria de Sistemes i Automàtica es_ES
dc.description.bibliographicCitation Barceló Rico, F.; Diez, J. (2012). Geometrical codification for clustering mixed categorical and numerical databases. Journal of Intelligent Information Systems. 39(1):167-185. https://doi.org/10.1007/s10844-011-0187-y es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion http://doi.org/10.1007/s10844-011-0187-y es_ES
dc.description.upvformatpinicio 167 es_ES
dc.description.upvformatpfin 185 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 39 es_ES
dc.description.issue 1 es_ES
dc.relation.pasarela S\208967 es_ES
dc.contributor.funder Ministerio de Educación y Ciencia es_ES
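
The abstract describes codifying categorical attributes with polar or spherical coordinates and then running k-means on the resulting numerical matrix. This record does not give the exact mapping, so the following is only a minimal sketch of one possible reading: each category of a single attribute is placed at an equal angle on the unit circle. The equal-angle placement, the function names (`polar_codes`, `encode_mixed`, `pairwise_distances`, `kmeans`) and the toy data are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def polar_codes(n_categories, radius=1.0):
    """Place n_categories points at equal angles on a circle.

    Assumed scheme for illustration; the paper's exact mapping is not
    given in this record.
    """
    angles = 2.0 * np.pi * np.arange(n_categories) / n_categories
    return radius * np.column_stack((np.cos(angles), np.sin(angles)))


def encode_mixed(numeric, categorical, n_categories):
    """Replace one integer-coded categorical column by its 2-D polar code."""
    codes = polar_codes(n_categories)
    return np.hstack((numeric, codes[categorical]))


def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def kmeans(data, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means, written out to keep the sketch dependency-free."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    labels = np.zeros(len(data), dtype=int)
    for _ in range(n_iter):
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels, centers


# Toy mixed data set: one numerical column, one categorical column (3 levels).
numeric = np.array([[0.1], [0.2], [5.0], [5.1]])
categorical = np.array([0, 0, 1, 1])
encoded = encode_mixed(numeric, categorical, n_categories=3)
labels, centers = kmeans(encoded, k=2)

# Three categories on a circle are mutually equidistant (distance sqrt(3)).
d3 = pairwise_distances(polar_codes(3))
# Four categories are not: neighbours are sqrt(2) apart, opposites are 2 apart.
d4 = pairwise_distances(polar_codes(4))
```

Under this assumed placement, each categorical attribute adds two columns to the input matrix whatever its number of categories. The unequal distances that appear from four categories onward are one plausible source of the kind of codification error the abstract mentions, though the record itself does not say how that error is defined.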

