Web: http://arxiv.org/abs/2209.07738

Sept. 22, 2022, 1:15 a.m. | Zimian Wei, Hengyue Pan, Xin Niu, Dongsheng Li

cs.CV updates on arXiv.org arxiv.org

Vision transformers have shown excellent performance in computer vision
tasks. However, their (local) self-attention mechanism is computationally
expensive. By comparison, CNNs are more efficient thanks to their built-in
inductive bias. Recent work shows that CNNs can compete with vision
transformers by adopting their architecture design and training protocols.
Nevertheless, existing methods either ignore multi-level features or lack
dynamic properties, leading to sub-optimal performance. In this paper, we
propose a novel attention mechanism named MCA, which captures different
patterns …

