March 19, 2024, 4:44 a.m. | Yihao Xue, Siddharth Joshi, Dang Nguyen, Baharan Mirzasoleiman

cs.LG updates on arXiv.org arxiv.org

arXiv:2310.04971v2 Announce Type: replace
Abstract: Recently, multimodal contrastive learning (MMCL) approaches, such as CLIP, have achieved a remarkable success in learning representations that are robust against distribution shift and generalize to new domains. Despite the empirical success, the mechanism behind learning such generalizable representations is not understood. In this work, we rigorously analyze this problem and uncover two mechanisms behind MMCL's robustness: \emph{intra-class contrasting}, which allows the model to learn features with a high variance, and \emph{inter-class feature sharing}, where …

abstract arxiv clip cs.lg distribution domains modal multi-modal multimodal robust robustness shift success type understanding work

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Principal Applied Scientist

@ Microsoft | Redmond, Washington, United States

Data Analyst / Action Officer

@ OASYS, INC. | OASYS, INC., Pratt Avenue Northwest, Huntsville, AL, United States