May 24, 2024, 4:52 a.m. | Feng Wang, Jiahao Wang, Sucheng Ren, Guoyizhe Wei, Jieru Mei, Wei Shao, Yuyin Zhou, Alan Yuille, Cihang Xie

cs.CV updates on arXiv.org

arXiv:2405.14858v1 Announce Type: new
Abstract: This paper identifies artifacts within the feature maps of Vision Mamba, similar to those found in Vision Transformers. These artifacts, corresponding to high-norm tokens emerging in low-information background areas of images, appear much more severe in Vision Mamba: they exist prevalently even in the tiny-sized model and activate extensively across background regions. To mitigate this issue, we follow the prior solution of introducing register tokens into Vision Mamba. To better cope with Mamba blocks' uni-directional …
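
To make the notion of "high-norm tokens" concrete, here is a minimal sketch of how such outliers could be flagged in a feature map. It is not the paper's method: the function names, the toy data, and the 3x-median threshold are assumptions chosen for illustration only.

```python
import math
from statistics import median

def token_norms(tokens):
    """Return the L2 norm of each token (a sequence of floats)."""
    return [math.sqrt(sum(x * x for x in tok)) for tok in tokens]

def high_norm_indices(tokens, ratio=3.0):
    """Return indices of tokens whose norm exceeds `ratio` times the median norm.

    `ratio` is a hypothetical threshold, not a value taken from the paper.
    """
    norms = token_norms(tokens)
    cutoff = ratio * median(norms)
    return [i for i, n in enumerate(norms) if n > cutoff]

# Toy feature map: five 2-dimensional tokens, one with an unusually large norm.
feature_map = [[0.5, 0.5], [0.6, 0.4], [9.0, 12.0], [0.4, 0.6], [0.5, 0.4]]
outliers = high_norm_indices(feature_map)
print(outliers)
```

In a real model the tokens would be the per-patch output features of a block, and the flagged indices could be mapped back to image patches to check whether they fall in background regions.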
