Exploring high-level features for detecting cyberpedophilia

RiuNet: Institutional Repository of the Universitat Politècnica de València

dc.contributor.author Bogdanova, Dasha es_ES
dc.contributor.author Rosso, Paolo es_ES
dc.contributor.author Solorio, Thamar es_ES
dc.date.accessioned 2015-04-28T12:35:18Z
dc.date.available 2015-04-28T12:35:18Z
dc.date.issued 2014-01
dc.identifier.issn 0885-2308
dc.identifier.uri http://hdl.handle.net/10251/49393
dc.description.abstract [EN] In this paper, we suggest a list of high-level features and study their applicability in detection of cyberpedophiles. We used a corpus of chats downloaded from http://www.perverted-justice.com and two negative datasets of different nature: cybersex logs available online, and the NPS chat corpus. The classification results show that the NPS data and the pedophiles’ conversations can be accurately discriminated from each other with character n-grams, while in the more complicated case of cybersex logs there is need for high-level features to reach good accuracy levels. In this latter setting our results show that features that model behaviour and emotion significantly outperform the low-level ones, and achieve a 97% accuracy. es_ES
dc.description.sponsorship The work of Dasha Bogdanova was partially carried out during an internship at the Universitat Politècnica de València (scholarship of the University of St. Petersburg). Her research was partially supported by a Google Research Award. The collaboration with Thamar Solorio was possible thanks to her one-month research visit at the Universitat Politècnica de València (program PAID-02-11, award no. 1932). The research work of Paolo Rosso was done in the framework of the European Commission WIQ-EI Web Information Quality Evaluation Initiative (IRSES Grant No. 269180) project within the FP7 Marie Curie People programme, the DIANA-APPLICATIONS - Finding Hidden Knowledge in Texts: Applications (TIN2012-38603-C02-01) project, and the VLC/CAMPUS Microcluster on Multimodal Interaction in Intelligent Systems. en_EN
dc.language English es_ES
dc.publisher Elsevier es_ES
dc.relation.ispartof Computer Speech and Language es_ES
dc.rights All rights reserved es_ES
dc.subject Cyberpedophilia es_ES
dc.subject Sentiment analysis es_ES
dc.subject Emotion detection es_ES
dc.subject.classification COMPUTER LANGUAGES AND SYSTEMS es_ES
dc.title Exploring high-level features for detecting cyberpedophilia es_ES
dc.type Article es_ES
dc.identifier.doi 10.1016/j.csl.2013.04.007
dc.relation.projectID info:eu-repo/grantAgreement/EC/FP7/269180/EU/Web Information Quality Evaluation Initiative/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/UPV//PAID-02-11-1932/ es_ES
dc.relation.projectID info:eu-repo/grantAgreement/MINECO//TIN2012-38603-C02-01/ES/DIANA-APPLICATIONS: FINDING HIDDEN KNOWLEDGE IN TEXTS: APPLICATIONS/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Departamento de Sistemas Informáticos y Computación - Departament de Sistemes Informàtics i Computació es_ES
dc.description.bibliographicCitation Bogdanova, D.; Rosso, P.; Solorio, T. (2014). Exploring high-level features for detecting cyberpedophilia. Computer Speech and Language. 28(1):108-120. https://doi.org/10.1016/j.csl.2013.04.007 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion http://dx.doi.org/10.1016/j.csl.2013.04.007 es_ES
dc.description.upvformatpinicio 108 es_ES
dc.description.upvformatpfin 120 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 28 es_ES
dc.description.issue 1 es_ES
dc.relation.senia 285884
dc.contributor.funder European Commission
dc.contributor.funder Universitat Politècnica de València
dc.contributor.funder Ministerio de Economía y Competitividad