Web: http://arxiv.org/abs/2201.10728

Jan. 27, 2022, 2:10 a.m. | Yun-Hao Cao, Hao Yu, Jianxin Wu

cs.CV updates on arXiv.org

Vision Transformers (ViTs) are emerging as an alternative to convolutional
neural networks (CNNs) for visual recognition. They achieve results
competitive with CNNs, but the lack of the typical convolutional inductive
bias makes them more data-hungry than common CNNs. They are often pretrained
on JFT-300M or at least on ImageNet, and few works study training ViTs with
limited data. In this paper, we investigate how to train ViTs with limited
data (e.g., 2040 images). We give theoretical analyses that our method
(based …
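The abstract breaks off before describing the method, but the limited-data setting itself is concrete: training a ViT from scratch on roughly 2040 labeled images (a figure that matches, e.g., the Oxford Flowers-102 train+val split). Below is a minimal, hypothetical PyTorch sketch of that baseline setup; the model choice (torchvision's vit_b_16), the hyperparameters, and the synthetic data are illustrative assumptions, not the authors' method.

    # Minimal sketch (assumptions, not the paper's method): train a ViT
    # from scratch on a small labeled dataset, mirroring the ~2040-image
    # limited-data setting described in the abstract.
    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset
    from torchvision.models import vit_b_16

    # Synthetic stand-in for a small dataset (e.g., 2040 images across
    # 102 classes); substitute an ImageFolder over real data in practice.
    N, NUM_CLASSES = 256, 102  # small N keeps the sketch cheap to run
    data = TensorDataset(torch.randn(N, 3, 224, 224),
                         torch.randint(0, NUM_CLASSES, (N,)))
    loader = DataLoader(data, batch_size=32, shuffle=True)

    model = vit_b_16(weights=None)             # no JFT-300M / ImageNet pretraining
    model.heads = nn.Linear(768, NUM_CLASSES)  # replace the classification head

    opt = torch.optim.AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for epoch in range(2):                     # tiny loop for illustration
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()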

arxiv cv images training transformers vision
