Mamba-R: Vision Mamba ALSO Needs Registers

May 24, 2024, 4:52 a.m. | Feng Wang, Jiahao Wang, Sucheng Ren, Guoyizhe Wei, Jieru Mei, Wei Shao, Yuyin Zhou, Alan Yuille, Cihang Xie

cs.CV updates on arXiv.org

arXiv:2405.14858v1 Announce Type: new
Abstract: This paper identifies artifacts within the feature maps of Vision Mamba, similar to those found in Vision Transformers. These artifacts, which correspond to high-norm tokens emerging in low-information background areas of images, appear much more severe in Vision Mamba: they are prevalent even in the tiny-sized model and activate extensively across background regions. To mitigate this issue, we follow the prior solution of introducing register tokens into Vision Mamba. To better cope with Mamba blocks' uni-directional …
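
The "high-norm tokens" the abstract mentions can be illustrated with a minimal sketch that computes the L2 norm of each token's feature vector and flags outliers. This is not the paper's code: the function names, the toy feature values, and the "3x the median norm" cutoff are all hypothetical choices made for illustration, and the abstract does not state how the authors identify artifact tokens.

```python
import math
from statistics import median


def token_norms(tokens):
    """Return the L2 norm of each token's feature vector."""
    return [math.sqrt(sum(x * x for x in tok)) for tok in tokens]


def high_norm_indices(tokens, ratio=3.0):
    """Return indices of tokens whose norm exceeds ratio * median norm.

    The ratio-to-median rule is an assumption for this sketch,
    not a criterion taken from the paper.
    """
    norms = token_norms(tokens)
    cutoff = ratio * median(norms)
    return [i for i, n in enumerate(norms) if n > cutoff]


# Toy feature map: six tokens with four features each (made-up values).
# Token 2 stands in for a background patch that carries an artifact.
tokens = [
    [0.5, 0.1, -0.2, 0.3],
    [0.4, -0.3, 0.2, 0.1],
    [9.0, -7.5, 8.2, 6.1],
    [0.2, 0.6, -0.1, 0.4],
    [0.3, 0.2, 0.5, -0.4],
    [0.1, -0.5, 0.3, 0.2],
]

outliers = high_norm_indices(tokens)
print(outliers)  # [2]
```

On real feature maps the norms would come from a trained model's intermediate outputs, and a sensible cutoff would have to be chosen by inspecting the norm distribution.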


