Large Language Models for Captioning and Retrieving Remote Sensing Images

Feb. 12, 2024, 5:45 a.m. | João Daniel Silva, João Magalhães, Devis Tuia, Bruno Martins

cs.CV updates on arXiv.org

Image captioning and cross-modal retrieval are examples of tasks that involve the joint analysis of visual and linguistic information. Applied to remote sensing imagery, these tasks can help non-expert users extract relevant Earth observation information for a variety of applications. Still, despite some previous efforts, the development and application of vision and language models in the remote sensing domain have been hindered by the relatively small size of the available datasets and of the models used in previous studies. In …


