Web: http://arxiv.org/abs/2201.12086

Jan. 31, 2022, 2:10 a.m. | Junnan Li, Dongxu Li, Caiming Xiong, Steven Hoi

cs.CV updates on arXiv.org arxiv.org

Vision-Language Pre-training (VLP) has advanced the performance for many
vision-language tasks. However, most existing pre-trained models only excel in
either understanding-based tasks or generation-based tasks. Furthermore,
performance improvement has been largely achieved by scaling up the dataset
with noisy image-text pairs collected from the web, which is a suboptimal
source of supervision. In this paper, we propose BLIP, a new VLP framework
which transfers flexibly to both vision-language understanding and generation
tasks. BLIP effectively utilizes the noisy web data by …

arxiv bootstrapping cv language training vision

Director, Data Science (Advocacy & Nonprofit)

@ Civis Analytics | Remote

Data Engineer

@ Rappi | [CO] Bogotá

Data Scientist V, Marketplaces Personalization (Remote)

@ ID.me | United States (U.S.)

Product OPs Data Analyst (Flex/Remote)

@ Scaleway | Paris

Big Data Engineer

@ Risk Focus | Riga, Riga, Latvia

Internship Program: Machine Learning Backend

@ Nextail | Remote job