SP-ViT: Learning 2D Spatial Priors for Vision Transformers. (arXiv:2206.07662v1 [cs.CV])
Web: http://arxiv.org/abs/2206.07662
June 16, 2022, 1:13 a.m. | Yuxuan Zhou, Wangmeng Xiang, Chao Li, Biao Wang, Xihan Wei, Lei Zhang, Margret Keuper, Xiansheng Hua
Source: cs.CV updates on arXiv.org
Recently, transformers have shown great potential in image classification and
established state-of-the-art results on the ImageNet benchmark. However,
compared to CNNs, transformers converge slowly and are prone to overfitting in
low-data regimes due to the lack of spatial inductive biases. Such spatial
inductive biases can be especially beneficial since the 2D structure of an
input image is not well preserved in transformers. In this work, we present
Spatial Prior-enhanced Self-Attention (SP-SA), a novel variant of vanilla
Self-Attention (SA) tailored for …
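A minimal sketch of the idea, under assumptions: the excerpt only says that SP-SA augments vanilla self-attention with learned 2D spatial priors, so the PyTorch code below realizes that as a learned per-head relative-position bias over the 2D patch grid, added to the attention logits before the softmax. The class name SpatialPriorSelfAttention, the grid_size argument, and the bias parameterization are illustrative choices, not the paper's exact formulation (which the truncated abstract does not specify).

import torch
import torch.nn as nn

class SpatialPriorSelfAttention(nn.Module):
    """Self-attention over an H x W token grid with a learned 2D spatial
    prior added to the attention logits. Hypothetical reconstruction of
    the idea described in the abstract, not the paper's exact SP-SA."""

    def __init__(self, dim, num_heads, grid_size):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        H, W = grid_size
        # One learnable bias per head and per 2D relative offset:
        # offsets range over (2H-1) x (2W-1) possible displacements.
        self.prior = nn.Parameter(
            torch.zeros(num_heads, (2 * H - 1) * (2 * W - 1)))
        # Precompute the relative-offset index for every token pair.
        coords = torch.stack(torch.meshgrid(
            torch.arange(H), torch.arange(W), indexing="ij"))  # (2, H, W)
        coords = coords.flatten(1)                             # (2, N)
        rel = coords[:, :, None] - coords[:, None, :]          # (2, N, N)
        idx = (rel[0] + H - 1) * (2 * W - 1) + (rel[1] + W - 1)
        self.register_buffer("idx", idx)                       # (N, N)

    def forward(self, x):  # x: (B, N, C) with N == H * W
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads,
                                  C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)           # each (B, heads, N, d)
        attn = (q @ k.transpose(-2, -1)) * self.scale  # (B, heads, N, N)
        attn = attn + self.prior[:, self.idx]          # add 2D spatial prior
        attn = attn.softmax(dim=-1)
        return self.proj((attn @ v).transpose(1, 2).reshape(B, N, C))

# Usage: a 14x14 patch grid, as in a ViT on 224x224 images with 16x16 patches.
sa = SpatialPriorSelfAttention(dim=384, num_heads=6, grid_size=(14, 14))
out = sa(torch.randn(2, 14 * 14, 384))  # -> shape (2, 196, 384)

Because the prior depends only on relative 2D offsets between patches, it restores a form of translation-aware spatial structure that flattened token sequences otherwise lose, which is the kind of inductive bias the abstract argues transformers lack.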
Latest AI/ML/Big Data Jobs
Machine Learning Researcher - Saalfeld Lab
@ Howard Hughes Medical Institute - Chevy Chase, MD | Ashburn, Virginia
Project Director, Machine Learning in US Health
@ ideas42.org | Remote, US
Data Science Intern
@ NannyML | Remote
Machine Learning Engineer NLP/Speech
@ Play.ht | Remote
Research Scientist, 3D Reconstruction
@ Yembo | Remote, US
Clinical Assistant or Associate Professor of Management Science and Systems
@ University at Buffalo | Buffalo, NY