Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment. (arXiv:2208.13628v2 [cs.CV] UPDATED)
Oct. 6, 2022, 1:16 a.m. | Mustafa Shukor, Guillaume Couairon, Matthieu Cord
cs.CV updates on arXiv.org (arxiv.org)
Vision and Language Pretraining has become the prevalent approach for
tackling multimodal downstream tasks. The current trend is to move towards ever
larger models and pretraining datasets. This headlong computational rush is
not sustainable in the long term and de facto excludes academic laboratories
with limited resources. In this work,
we propose a new framework, dubbed ViCHA, that efficiently exploits the input
data to boost the learning by: (a) a new hierarchical cross-modal alignment …
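The hierarchical cross-modal alignment mentioned in point (a) is the technical core of the excerpt. Since the abstract is truncated here, the following PyTorch sketch is only an illustration of what such an objective could look like: an InfoNCE-style image-text contrastive loss applied at several encoder depths rather than only at the final layer. Every function name, dimension, and weighting choice below is an assumption for illustration, not the actual ViCHA formulation.

# Hedged sketch: multi-level (hierarchical) image-text contrastive alignment.
# All names and shapes are illustrative assumptions; see the paper for details.
import torch
import torch.nn.functional as F

def info_nce(image_emb, text_emb, temperature=0.07):
    """Symmetric image-text contrastive loss over a batch of paired embeddings."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    # Matched pairs lie on the diagonal; contrast them against in-batch negatives.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def hierarchical_alignment_loss(image_feats, text_feats, weights=None):
    """Weighted sum of alignment losses over several encoder levels.

    image_feats / text_feats: lists of per-level pooled embeddings, each (B, D),
    e.g. taken from intermediate layers of the vision and text encoders.
    """
    if weights is None:
        weights = [1.0] * len(image_feats)
    losses = [w * info_nce(v, t)
              for w, v, t in zip(weights, image_feats, text_feats)]
    return torch.stack(losses).sum()

if __name__ == "__main__":
    batch, dim, levels = 8, 256, 3
    # Random stand-ins for per-level image and text embeddings.
    img = [torch.randn(batch, dim) for _ in range(levels)]
    txt = [torch.randn(batch, dim) for _ in range(levels)]
    print(hierarchical_alignment_loss(img, txt))

Aligning intermediate representations in this way is one plausible reading of "hierarchical" alignment; it adds supervision at several depths without enlarging the model, which is consistent with the efficiency motivation stated in the abstract.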