BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation. (arXiv:2201.12086v1 [cs.CV])
Web: http://arxiv.org/abs/2201.12086
Jan. 31, 2022, 2:10 a.m. | Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi
cs.CV updates on arXiv.org
Vision-Language Pre-training (VLP) has advanced performance on many
vision-language tasks. However, most existing pre-trained models excel only at
either understanding-based tasks or generation-based tasks. Furthermore,
performance gains have largely come from scaling up the dataset with noisy
image-text pairs collected from the web, which is a suboptimal source of
supervision. In this paper, we propose BLIP, a new VLP framework that transfers
flexibly to both vision-language understanding and generation tasks. BLIP
effectively utilizes the noisy web data by …