On the Connection between Local Attention and Dynamic Depth-wise Convolution. (arXiv:2106.04263v5 [cs.CV] UPDATED)
Aug. 5, 2022, 1:12 a.m. | Qi Han, Zejia Fan, Qi Dai, Lei Sun, Ming-Ming Cheng, Jiaying Liu, Jingdong Wang
cs.CV updates on arXiv.org arxiv.org
Vision Transformer (ViT) attains state-of-the-art performance in visual
recognition, and its variant, the Local Vision Transformer, improves on it
further. The major component of the Local Vision Transformer, local attention,
performs attention separately over small local windows. We rephrase local
attention as a channel-wise locally-connected layer and analyze it in terms of
two network regularization schemes, sparse connectivity and weight sharing, as
well as weight computation. Sparse connectivity: there is no connection across
channels, and each position is connected to the positions within a …
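
The connection described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes 1D token sequences, non-overlapping windows, a single head, and identity query/key/value projections, and the names local_attention_1d and depthwise_conv_1d are hypothetical. In both functions each output channel is aggregated only from the same input channel at nearby positions. The difference is that the attention weights are computed from the input, whereas the convolution kernel is a fixed parameter.

```python
import numpy as np


def local_attention_1d(x, window):
    """Window attention over tokens x of shape (N, C).

    Assumptions for illustration: non-overlapping windows, one head,
    identity q/k/v projections (queries = keys = values = x).
    """
    n, c = x.shape
    assert n % window == 0, "sequence length must be a multiple of the window"
    out = np.empty_like(x)
    for start in range(0, n, window):
        w = x[start:start + window]                   # (window, C)
        scores = w @ w.T / np.sqrt(c)                 # weights computed from the input
        scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
        weights = np.exp(scores)
        weights /= weights.sum(axis=1, keepdims=True)
        # Aggregation: the same (window, window) weights are applied to every
        # channel, and channel c of the output only sums channel c of the input.
        out[start:start + window] = weights @ w
    return out


def depthwise_conv_1d(x, kernel):
    """Depth-wise convolution of x (N, C) with kernel (K, C), K odd.

    One filter per channel, zero padding so the output keeps shape (N, C).
    """
    n, _ = x.shape
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros_like(x)
    for i in range(n):
        # Each channel is filtered independently: no connection across channels.
        out[i] = (xp[i:i + k] * kernel).sum(axis=0)
    return out


rng = np.random.default_rng(0)
tokens = rng.standard_normal((8, 4))   # 8 positions, 4 channels
attended = local_attention_1d(tokens, window=4)
filtered = depthwise_conv_1d(tokens, rng.standard_normal((3, 4)))
```

Changing a token in the second window leaves the first window's attention output unchanged, and changing one input channel leaves the other channels of the depth-wise convolution output unchanged. Both effects are what sparse connectivity means in this toy setting.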