April 2, 2024, 7:47 p.m. | Qin Liu, Jaemin Cho, Mohit Bansal, Marc Niethammer

arXiv:2404.00741v1 Announce Type: new
Abstract: The goal of interactive image segmentation is to delineate specific regions within an image via visual or language prompts. Low-latency, high-quality interactive segmentation with diverse prompts remains challenging for existing specialist and generalist models. Specialist models, with their limited prompts and task-specific designs, experience high latency because they encode the image and visual prompts jointly, so the image must be recomputed every time the prompt is updated. Generalist models, exemplified by the Segment …

