Sept. 20, 2022, 1:13 a.m. | Juncheng Li, Xin He, Longhui Wei, Long Qian, Linchao Zhu, Lingxi Xie, Yueting Zhuang, Qi Tian, Siliang Tang

cs.CV updates on arXiv.org

Large-scale vision-language pre-training has shown impressive advances in a
wide range of downstream tasks. Existing methods mainly model the cross-modal
alignment by the similarity of the global representations of images and texts,
or advanced cross-modal attention upon image and text features. However, they
fail to explicitly learn the fine-grained semantic alignment between visual
regions and textual phrases, as only global image-text alignment information is
available. In this paper, we introduce LOUPE, a fine-grained semantically
aLigned visiOn-langUage PrE-training framework, which learns …
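To make the distinction concrete, the following sketch contrasts global image-text alignment (a single similarity score between pooled embeddings) with fine-grained region-phrase alignment. The shapes, the max-over-regions pooling, and all variable names are illustrative assumptions for exposition, not LOUPE's actual formulation.

```python
import numpy as np

def normalize(x):
    # L2-normalize along the last axis so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

rng = np.random.default_rng(0)

# Global alignment (what most prior methods model):
# one embedding per image and per caption, one scalar score.
image_global = normalize(rng.normal(size=(1, 256)))
text_global = normalize(rng.normal(size=(1, 256)))
global_score = float(image_global @ text_global.T)

# Fine-grained alignment (hypothetical shapes): 4 visual-region embeddings
# and 3 textual-phrase embeddings; score each phrase by its best region.
regions = normalize(rng.normal(size=(4, 256)))
phrases = normalize(rng.normal(size=(3, 256)))
sim = regions @ phrases.T            # (4, 3) region-phrase cosine matrix
fine_score = sim.max(axis=0).mean()  # max over regions, mean over phrases

print(global_score, fine_score)
```

The point of the contrast: the global score collapses the whole image and caption into one number, whereas the region-phrase matrix exposes which visual region supports which phrase, which is the kind of signal fine-grained pre-training aims to learn.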

