April 11, 2024, 4:45 a.m. | Haojie Zhang, Yongyi Su, Xun Xu, Kui Jia

cs.CV updates on arXiv.org arxiv.org

arXiv:2312.03502v2 Announce Type: replace
Abstract: The success of large language models has inspired the computer vision community to explore image segmentation foundation models that can generalize in zero/few-shot settings through prompt engineering. Segment-Anything (SAM), among others, is the state-of-the-art image segmentation foundation model, demonstrating strong zero/few-shot generalization. Despite this success, recent studies reveal the weakness of SAM under strong distribution shift. In particular, SAM performs poorly on corrupted natural images, camouflaged images, medical images, etc. Motivated by these observations, we aim …

