Evaluation of term-weighting measures for grouped text documents with a target variable: a simulation study

RiuNet: Institutional Repository of the Universidad Politécnica de Valencia

dc.contributor.author Ricciardi, Riccardo es_ES
dc.contributor.author Manisera, Marica es_ES
dc.date.accessioned 2024-01-11T09:06:59Z
dc.date.available 2024-01-11T09:06:59Z
dc.date.issued 2023-09-22
dc.identifier.isbn 9788413960869
dc.identifier.uri http://hdl.handle.net/10251/201768
dc.description.abstract [EN] In text mining applications, count-based models are often used to represent text documents. When two document-level variables are available, i.e. an outcome and a grouping variable, the weight of a word in the documents may depend on group membership. The contribution of this work is to frame this setting statistically by modelling the corpus of documents with a Multivariate Binomial distribution (Hudson et al., 1986). The advantage of this solution is two-fold: it allows us (1) to review, within a statistical framework, some term-weighting measures used in the literature (Samant et al., 2019), and (2) to simulate corpora with predefined characteristics by means of the Gaussian Copula method (Genest and McKay, 1986). The simulation is used to investigate whether the existing measures, computed on the group-word interaction, capture both the group-word relationship itself and the target-word association. Results from the simulation study show relationships that can be exploited with visualization tools. es_ES
dc.language English es_ES
dc.publisher Editorial Universitat Politècnica de València es_ES
dc.relation.ispartof 5th International Conference on Advanced Research Methods and Analytics (CARMA 2023)
dc.rights Attribution - NonCommercial - ShareAlike (by-nc-sa) es_ES
dc.subject Term-weighting measures es_ES
dc.subject Gaussian Copula es_ES
dc.subject Simulation es_ES
dc.title Evaluation of term-weighting measures for grouped text documents with a target variable: a simulation study es_ES
dc.type Book chapter es_ES
dc.type Conference paper es_ES
dc.rights.accessRights Open es_ES
dc.description.bibliographicCitation Ricciardi, R.; Manisera, M. (2023). Evaluation of term-weighting measures for grouped text documents with a target variable: a simulation study. Editorial Universitat Politècnica de València. 97-98. http://hdl.handle.net/10251/201768 es_ES
dc.description.accrualMethod OCS es_ES
dc.relation.conferencename CARMA 2023 - 5th International Conference on Advanced Research Methods and Analytics es_ES
dc.relation.conferencedate June 28-30, 2023 es_ES
dc.relation.conferenceplace Seville, Spain es_ES
dc.relation.publisherversion http://ocs.editorial.upv.es/index.php/CARMA/CARMA2023/paper/view/16506 es_ES
dc.description.upvformatpinicio 97 es_ES
dc.description.upvformatpfin 98 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.relation.pasarela OCS\16506 es_ES
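
The abstract mentions simulating corpora with predefined characteristics through a Gaussian copula with binomial margins. The following is a minimal sketch of that general idea, not the authors' procedure: it draws correlated word counts per document by correlating standard normals, mapping them to uniforms, and applying the binomial quantile function. The names `simulate_counts`, `binomial_quantile`, and `cholesky`, and all parameter values (`n_trials`, `probs`, `corr`), are illustrative assumptions that do not come from the record. The outcome and grouping variables described in the abstract are not modelled here.

```python
import math
import random
from statistics import NormalDist


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def binomial_quantile(u, n_trials, prob):
    """Smallest k with P(X <= k) >= u for X ~ Binomial(n_trials, prob)."""
    cumulative = 0.0
    for k in range(n_trials + 1):
        cumulative += math.comb(n_trials, k) * prob**k * (1 - prob) ** (n_trials - k)
        if cumulative >= u:
            return k
    return n_trials  # guard against floating-point shortfall when u is close to 1


def simulate_counts(n_docs, n_trials, probs, corr, seed=0):
    """Draw n_docs vectors of correlated binomial counts via a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    lower = cholesky(corr)
    dim = len(probs)
    docs = []
    for _ in range(n_docs):
        # Independent standard normals, correlated through the Cholesky factor.
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        z = [sum(lower[i][k] * eps[k] for k in range(i + 1)) for i in range(dim)]
        # Map each normal to a uniform, then to its binomial margin.
        u = [std_normal.cdf(value) for value in z]
        docs.append([binomial_quantile(u[j], n_trials, probs[j]) for j in range(dim)])
    return docs


# Hypothetical setup: three words, the first two positively associated.
corr = [[1.0, 0.6, 0.0],
        [0.6, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
counts = simulate_counts(n_docs=500, n_trials=20, probs=[0.1, 0.2, 0.05], corr=corr)
```

Each row of `counts` is one simulated document and each column one word. The correlation matrix applies to the latent normals, so the correlation between the resulting counts is generally somewhat weaker than the value specified in `corr`.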

