Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatial Relation Matching
March 15, 2024, 4:46 a.m. | Meng Chu, Zhedong Zheng, Wei Ji, Tingyu Wang, Tat-Seng Chua
cs.CV updates on arXiv.org
Abstract: Navigating drones through natural language commands remains challenging due to the dearth of accessible multi-modal datasets and the stringent precision requirements for aligning visual and textual data. To address this pressing need, we introduce GeoText-1652, a new natural language-guided geo-localization benchmark. This dataset is systematically constructed through an interactive human-computer process leveraging Large Language Model (LLM) driven annotation techniques in conjunction with pre-trained vision models. GeoText-1652 extends the established University-1652 image dataset with spatial-aware text …