
Summarization of Videos with the Signature Transform

RiuNet: Institutional Repository of the Universitat Politècnica de València



dc.contributor.author de Curtò-I Díaz, Joaquim es_ES
dc.contributor.author de Zarzà-I Cubero, Irene es_ES
dc.contributor.author Roig, Gemma es_ES
dc.contributor.author Tavares De Araujo Cesariny Calafate, Carlos Miguel es_ES
dc.date.accessioned 2024-04-30T18:06:39Z
dc.date.available 2024-04-30T18:06:39Z
dc.date.issued 2023-04 es_ES
dc.identifier.uri http://hdl.handle.net/10251/203874
dc.description.abstract [EN] This manuscript presents a new benchmark for assessing the quality of visual summaries without the need for human annotators. It is based on the Signature Transform, specifically focusing on the RMSE and the MAE Signature and Log-Signature metrics, and builds upon the assumption that uniform random sampling can offer accurate summarization capabilities. We provide a new dataset comprising videos from YouTube and their corresponding automatic audio transcriptions. Firstly, we introduce a preliminary baseline for automatic video summarization, which has at its core a Vision Transformer, an image-text model pre-trained with Contrastive Language-Image Pre-training (CLIP), as well as a module of object detection. Following that, we propose an accurate technique grounded in the harmonic components captured by the Signature Transform, which delivers compelling accuracy. The analytical measures are extensively evaluated, and we conclude that they strongly correlate with the notion of a good summary. es_ES
dc.description.sponsorship This work was supported by the HK Innovation and Technology Commission (InnoHK Project CIMDA). We acknowledge the support of Universitat Politècnica de València; R&D project PID2021-122580NB-I00, funded by MCIN/AEI/10.13039/501100011033 and ERDF. We thank the following funding sources from GOETHE-University Frankfurt am Main; DePP Dezentrale Planung von Platoons im Straßengüterverkehr mit Hilfe einer KI auf Basis einzelner LKW and Center for Data Science & AI. es_ES
dc.language English es_ES
dc.publisher MDPI AG es_ES
dc.relation.ispartof Electronics es_ES
dc.rights Attribution (by) es_ES
dc.subject Video summarization es_ES
dc.subject Large language models es_ES
dc.subject Visual language models es_ES
dc.subject CLIP es_ES
dc.subject Signature transform es_ES
dc.subject.classification COMPUTER ARCHITECTURE AND TECHNOLOGY es_ES
dc.title Summarization of Videos with the Signature Transform es_ES
dc.type Article es_ES
dc.identifier.doi 10.3390/electronics12071735 es_ES
dc.relation.projectID info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-122580NB-I00/ES/SISTEMAS INTELIGENTES DE SENSORIZACION PARA ECOSISTEMAS, ESPACIOS URBANOS Y MOVILIDAD SOSTENIBLE/ es_ES
dc.rights.accessRights Open es_ES
dc.contributor.affiliation Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica es_ES
dc.description.bibliographicCitation De Curtò-I Díaz, J.; De Zarzà-I Cubero, I.; Roig, G.; Tavares De Araujo Cesariny Calafate, CM. (2023). Summarization of Videos with the Signature Transform. Electronics. 12(7). https://doi.org/10.3390/electronics12071735 es_ES
dc.description.accrualMethod S es_ES
dc.relation.publisherversion https://doi.org/10.3390/electronics12071735 es_ES
dc.type.version info:eu-repo/semantics/publishedVersion es_ES
dc.description.volume 12 es_ES
dc.description.issue 7 es_ES
dc.identifier.eissn 2079-9292 es_ES
dc.relation.pasarela S\487015 es_ES
dc.contributor.funder AGENCIA ESTATAL DE INVESTIGACION es_ES
dc.contributor.funder European Regional Development Fund es_ES
dc.contributor.funder Universitat Politècnica de València es_ES

