May 24, 2024, 4:52 a.m. | Feng Wang, Jiahao Wang, Sucheng Ren, Guoyizhe Wei, Jieru Mei, Wei Shao, Yuyin Zhou, Alan Yuille, Cihang Xie

cs.CV updates on arXiv.org

arXiv:2405.14858v1
Abstract: This paper identifies artifacts in the feature maps of Vision Mamba, similar to those previously observed in Vision Transformers. These artifacts, corresponding to high-norm tokens emerging in low-information background areas of images, appear much more severe in Vision Mamba -- they exist prevalently even with the tiny-sized model and activate extensively across background regions. To mitigate this issue, we follow the prior solution of introducing register tokens into Vision Mamba. To better cope with Mamba blocks' uni-directional …
