Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment. (arXiv:2208.13628v2 [cs.CV] UPDATED)
Oct. 6, 2022, 1:16 a.m. | Mustafa Shukor, Guillaume Couairon, Matthieu Cord
cs.CV updates on arXiv.org (arxiv.org)
Vision and Language Pretraining has become the prevalent approach for
tackling multimodal downstream tasks. The current trend is to move towards ever
larger models and pretraining datasets. This headlong computational rush is
not sustainable in the long term and de facto excludes academic laboratories
with limited resources. In this work,
we propose a new framework, dubbed ViCHA, that efficiently exploits the input
data to boost the learning by: (a) a new hierarchical cross-modal alignment …
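The hierarchical cross-modal alignment mentioned in point (a) is the technical core of the excerpt. Since the abstract is truncated here, the following PyTorch sketch is only an illustration of what such an objective could look like: an InfoNCE-style image-text contrastive loss applied at several encoder depths rather than only at the final layer. Every function name, dimension, and weighting choice below is an assumption for illustration, not the actual ViCHA formulation.

# Hedged sketch: multi-level (hierarchical) image-text contrastive alignment.
# All names and shapes are illustrative assumptions; see the paper for details.
import torch
import torch.nn.functional as F

def info_nce(image_emb, text_emb, temperature=0.07):
    """Symmetric image-text contrastive loss over a batch of paired embeddings."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature  # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    # Matched pairs lie on the diagonal; contrast them against in-batch negatives.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def hierarchical_alignment_loss(image_feats, text_feats, weights=None):
    """Weighted sum of alignment losses over several encoder levels.

    image_feats / text_feats: lists of per-level pooled embeddings, each (B, D),
    e.g. taken from intermediate layers of the vision and text encoders.
    """
    if weights is None:
        weights = [1.0] * len(image_feats)
    losses = [w * info_nce(v, t)
              for w, v, t in zip(weights, image_feats, text_feats)]
    return torch.stack(losses).sum()

if __name__ == "__main__":
    batch, dim, levels = 8, 256, 3
    # Random stand-ins for per-level image and text embeddings.
    img = [torch.randn(batch, dim) for _ in range(levels)]
    txt = [torch.randn(batch, dim) for _ in range(levels)]
    print(hierarchical_alignment_loss(img, txt))

Aligning intermediate representations in this way is one plausible reading of "hierarchical" alignment; it adds supervision at several depths without enlarging the model, which is consistent with the efficiency motivation stated in the abstract.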