March 15, 2024, 4:45 a.m. | Chaoyang Wang, Xiangtai Li, Henghui Ding, Lu Qi, Jiangning Zhang, Yunhai Tong, Chen Change Loy, Shuicheng Yan

cs.CV updates on arXiv.org

arXiv:2403.09616v1 Announce Type: new
Abstract: In-context segmentation has drawn more attention with the introduction of vision foundation models. Most existing approaches adopt metric learning or masked image modeling to build the correlation between visual prompts and input image queries. In this work, we explore this problem from a new perspective, using one representative generation model, the latent diffusion model (LDM). We observe a task gap between generation and segmentation in diffusion models, but LDM is still an effective minimalist for …
