dc.contributor.author | de Curtò, J. | es_ES |
dc.contributor.author | de Zarzà, I. | es_ES |
dc.contributor.author | Tavares De Araujo Cesariny Calafate, Carlos Miguel | es_ES |
dc.date.accessioned | 2024-10-23T18:08:22Z | |
dc.date.available | 2024-10-23T18:08:22Z | |
dc.date.issued | 2023-02 | es_ES |
dc.identifier.uri | http://hdl.handle.net/10251/210789 | |
dc.description.abstract | [EN] Unmanned Aerial Vehicles (UAVs) are able to provide instantaneous visual cues and a high-level data throughput that could be further leveraged to address complex tasks, such as semantically rich scene understanding. In this work, we built on the use of Large Language Models (LLMs) and Visual Language Models (VLMs), together with a state-of-the-art detection pipeline, to provide thorough zero-shot UAV scene literary text descriptions. The generated texts achieve a Gunning Fog median grade level in the range of 7-12. Applications of this framework could be found in the filming industry and could enhance user experience in theme parks or in the advertisement sector. We demonstrate a low-cost, highly efficient, state-of-the-art practical implementation of microdrones in a well-controlled and challenging setting, in addition to proposing the use of standardized readability metrics to assess LLM-enhanced descriptions. | es_ES |
dc.description.sponsorship | This work is supported by the HK Innovation and Technology Commission (InnoHK Project CIMDA). We acknowledge the support of Universitat Politècnica de València; R&D project PID2021-122580NB-I00, funded by MCIN/AEI/10.13039/501100011033 and ERDF. | es_ES |
dc.language | English | es_ES |
dc.publisher | MDPI AG | es_ES |
dc.relation.ispartof | Drones | es_ES |
dc.rights | Attribution (BY) | es_ES |
dc.subject | Scene understanding | es_ES |
dc.subject | Large language models | es_ES |
dc.subject | Visual language models | es_ES |
dc.subject | CLIP | es_ES |
dc.subject | GPT-3 | es_ES |
dc.subject | YOLOv7 | es_ES |
dc.subject | UAV | es_ES |
dc.subject.classification | COMPUTER ARCHITECTURE AND TECHNOLOGY | es_ES |
dc.title | Semantic Scene Understanding with Large Language Models on Unmanned Aerial Vehicles | es_ES |
dc.type | Article | es_ES |
dc.identifier.doi | 10.3390/drones7020114 | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-122580NB-I00/ES/SISTEMAS INTELIGENTES DE SENSORIZACION PARA ECOSISTEMAS, ESPACIOS URBANOS Y MOVILIDAD SOSTENIBLE/ | es_ES |
dc.rights.accessRights | Open | es_ES |
dc.contributor.affiliation | Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica | es_ES |
dc.description.bibliographicCitation | De Curtò, J.; De Zarzà, I.; Tavares De Araujo Cesariny Calafate, CM. (2023). Semantic Scene Understanding with Large Language Models on Unmanned Aerial Vehicles. Drones. 7(2). https://doi.org/10.3390/drones7020114 | es_ES |
dc.description.accrualMethod | S | es_ES |
dc.relation.publisherversion | https://doi.org/10.3390/drones7020114 | es_ES |
dc.type.version | info:eu-repo/semantics/publishedVersion | es_ES |
dc.description.volume | 7 | es_ES |
dc.description.issue | 2 | es_ES |
dc.identifier.eissn | 2504-446X | es_ES |
dc.relation.pasarela | S\482432 | es_ES |
dc.contributor.funder | AGENCIA ESTATAL DE INVESTIGACION | es_ES |
dc.contributor.funder | European Regional Development Fund | es_ES |
dc.contributor.funder | Universitat Politècnica de València | es_ES |