Semantic Scene Understanding with Large Language Models on Unmanned Aerial Vehicles

de Curtò, J.; de Zarzà, I.; Tavares De Araujo Cesariny Calafate, Carlos Miguel

doi:10.3390/drones7020114

Files in this item

Name: dedeTavares - ...

Size: 8.607Mb

Format: PDF

Description: Publisher's version

dc.contributor.author	de Curtò, J.	es_ES
dc.contributor.author	de Zarzà, I.	es_ES
dc.contributor.author	Tavares De Araujo Cesariny Calafate, Carlos Miguel	es_ES
dc.date.accessioned	2024-10-23T18:08:22Z
dc.date.available	2024-10-23T18:08:22Z
dc.date.issued	2023-02	es_ES
dc.identifier.uri	http://hdl.handle.net/10251/210789
dc.description.abstract	[EN] Unmanned Aerial Vehicles (UAVs) can provide instantaneous visual cues and high data throughput that could be further leveraged to address complex tasks, such as semantically rich scene understanding. In this work, we build on Large Language Models (LLMs) and Visual Language Models (VLMs), together with a state-of-the-art detection pipeline, to provide thorough zero-shot literary text descriptions of UAV scenes. The generated texts achieve a median Gunning Fog grade level in the range of 7-12. Applications of this framework could be found in the film industry, and it could enhance user experience in theme parks or in the advertising sector. We demonstrate a low-cost, highly efficient, state-of-the-art practical implementation of microdrones in a well-controlled and challenging setting, and we propose the use of standardized readability metrics to assess LLM-enhanced descriptions.	es_ES
dc.description.sponsorship	This work is supported by the HK Innovation and Technology Commission (InnoHK Project CIMDA). We acknowledge the support of Universitat Politècnica de València; R&D project PID2021-122580NB-I00, funded by MCIN/AEI/10.13039/501100011033 and ERDF.	es_ES
dc.language	English	es_ES
dc.publisher	MDPI AG	es_ES
dc.relation.ispartof	Drones	es_ES
dc.rights	Attribution (BY)	es_ES
dc.subject	Scene understanding	es_ES
dc.subject	Large language models	es_ES
dc.subject	Visual language models	es_ES
dc.subject	CLIP	es_ES
dc.subject	GPT-3	es_ES
dc.subject	YOLOv7	es_ES
dc.subject	UAV	es_ES
dc.subject.classification	Computer Architecture and Technology	es_ES
dc.title	Semantic Scene Understanding with Large Language Models on Unmanned Aerial Vehicles	es_ES
dc.type	Article	es_ES
dc.identifier.doi	10.3390/drones7020114	es_ES
dc.relation.projectID	info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2021-122580NB-I00/ES/SISTEMAS INTELIGENTES DE SENSORIZACION PARA ECOSISTEMAS, ESPACIOS URBANOS Y MOVILIDAD SOSTENIBLE/	es_ES
dc.rights.accessRights	Open	es_ES
dc.contributor.affiliation	Universitat Politècnica de València. Escola Tècnica Superior d'Enginyeria Informàtica	es_ES
dc.description.bibliographicCitation	De Curtò, J.; De Zarzà, I.; Tavares De Araujo Cesariny Calafate, CM. (2023). Semantic Scene Understanding with Large Language Models on Unmanned Aerial Vehicles. Drones. 7(2). https://doi.org/10.3390/drones7020114	es_ES
dc.description.accrualMethod	S	es_ES
dc.relation.publisherversion	https://doi.org/10.3390/drones7020114	es_ES
dc.type.version	info:eu-repo/semantics/publishedVersion	es_ES
dc.description.volume	7	es_ES
dc.description.issue	2	es_ES
dc.identifier.eissn	2504-446X	es_ES
dc.relation.pasarela	S\482432	es_ES
dc.contributor.funder	AGENCIA ESTATAL DE INVESTIGACION	es_ES
dc.contributor.funder	European Regional Development Fund	es_ES
dc.contributor.funder	Universitat Politècnica de València	es_ES
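
The abstract reports a median Gunning Fog grade level of 7-12 for the generated descriptions. As a rough illustration of how such a score can be computed, the sketch below applies the standard Gunning Fog formula, 0.4 × (words per sentence + 100 × complex words / words), where a complex word has three or more syllables. This is not the authors' implementation: the function names (`count_syllables`, `gunning_fog`, `median_fog`) are hypothetical, the syllable counter is a crude vowel-group heuristic, and the usual exclusions (proper nouns, compound words, common suffixes) are ignored.

```python
import re
import statistics

VOWEL_GROUPS = re.compile(r"[aeiouy]+")


def count_syllables(word):
    """Approximate syllables as vowel groups, discounting a trailing silent 'e'.

    This heuristic is an assumption made for illustration; a pronunciation
    dictionary would be more accurate.
    """
    word = word.lower()
    count = len(VOWEL_GROUPS.findall(word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def gunning_fog(text):
    """Return the Gunning Fog index of an English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    # Complex words: three or more (approximate) syllables.
    complex_words = [w for w in words if count_syllables(w) >= 3]
    words_per_sentence = len(words) / len(sentences)
    percent_complex = 100 * len(complex_words) / len(words)
    return 0.4 * (words_per_sentence + percent_complex)


def median_fog(descriptions):
    """Median Gunning Fog index over a collection of generated descriptions."""
    return statistics.median(gunning_fog(d) for d in descriptions)
```

Calling `median_fog` on a list of scene descriptions yields a single grade-level figure comparable in kind to the 7-12 range quoted in the abstract, although exact values depend on the tokenization and syllable rules used.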

RiuNet: Institutional Repository of the Universitat Politècnica de València
