April 7, 2022, 1:10 a.m. | Qiang Chen, Qiman Wu, Jian Wang, Qinghao Hu, Tao Hu, Errui Ding, Jian Cheng, Jingdong Wang

cs.CV updates on arXiv.org

While local-window self-attention performs notably well in vision tasks, it
suffers from limited receptive fields and weak modeling capability. This is
mainly because it performs self-attention within non-overlapping windows and
shares weights along the channel dimension. We propose MixFormer to address
these issues. First, we combine local-window self-attention with depth-wise
convolution in a parallel design, modeling cross-window connections to enlarge
the receptive fields. Second, we propose bi-directional interactions across
the two branches to provide complementary clues in the channel and spatial dimensions. …
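The two ideas in the abstract can be sketched roughly in code. The block below is a minimal PyTorch illustration, not the authors' implementation: the class name MixingBlock, the use of nn.MultiheadAttention as a stand-in for local-window self-attention, and the exact form of the channel_interaction and spatial_interaction modules are all assumptions inferred from the description above.

```python
# Minimal sketch, assuming standard PyTorch. All module names and the way
# the interactions are wired are hypothetical, based only on the abstract.
import torch
import torch.nn as nn

class MixingBlock(nn.Module):
    """Parallel local-window self-attention + depth-wise convolution with
    bi-directional cross-branch interactions (illustrative only)."""
    def __init__(self, dim, window_size=7):
        super().__init__()
        self.window_size = window_size
        # Attention branch: stand-in for local-window self-attention.
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        # Convolution branch: depth-wise conv models cross-window connections.
        self.dwconv = nn.Conv2d(dim, dim, kernel_size=3, padding=1, groups=dim)
        # Channel interaction: conv branch -> attention branch (channel clues).
        self.channel_interaction = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(dim, dim, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial interaction: attention branch -> conv branch (spatial clues).
        self.spatial_interaction = nn.Sequential(
            nn.Conv2d(dim, 1, kernel_size=1),
            nn.Sigmoid(),
        )
        # Merge the two branches back to the original channel width.
        self.proj = nn.Conv2d(2 * dim, dim, kernel_size=1)

    def forward(self, x):  # x: (B, C, H, W); H and W divisible by window_size
        B, C, H, W = x.shape
        ws = self.window_size
        conv_out = self.dwconv(x)

        # Partition into non-overlapping windows and attend within each.
        windows = (x.reshape(B, C, H // ws, ws, W // ws, ws)
                     .permute(0, 2, 4, 3, 5, 1)
                     .reshape(-1, ws * ws, C))
        attn_out, _ = self.attn(windows, windows, windows)
        attn_out = (attn_out.reshape(B, H // ws, W // ws, ws, ws, C)
                            .permute(0, 5, 1, 3, 2, 4)
                            .reshape(B, C, H, W))

        # Bi-directional interactions: each branch gates the other.
        attn_out = attn_out * self.channel_interaction(conv_out)
        conv_out = conv_out * self.spatial_interaction(attn_out)

        # Concatenate and project to mix features across the branches.
        return self.proj(torch.cat([attn_out, conv_out], dim=1))
```

A quick usage check, assuming feature maps whose spatial size is a multiple of the window size and whose channel width is divisible by the head count:

```python
block = MixingBlock(dim=64, window_size=7)
x = torch.randn(2, 64, 28, 28)
print(block(x).shape)  # torch.Size([2, 64, 28, 28])
```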

arxiv cv features windows
