MAFA: Managing False Negatives for Vision-Language Pre-training

June 14, 2024, 4:48 a.m. | Jaeseok Byun, Dohoon Kim, Taesup Moon

cs.CV updates on arXiv.org

arXiv:2312.06112v2 Announce Type: replace
Abstract: We consider a critical issue of false negatives in Vision-Language Pre-training (VLP), a challenge that arises from the inherent many-to-many correspondence of image-text pairs in large-scale web-crawled datasets. The presence of false negatives can impede achieving optimal performance and even lead to a significant performance drop. To address this challenge, we propose MAFA (MAnaging FAlse negatives), which consists of two pivotal components building upon the recently developed GRouped mIni-baTch sampling (GRIT) strategy: 1) an efficient …


