Web: http://arxiv.org/abs/2206.07990

June 20, 2022, 1:11 a.m. | Sukmin Yun, Hankook Lee, Jaehyung Kim, Jinwoo Shin

cs.LG updates on arXiv.org

Recent self-supervised learning (SSL) methods have shown impressive results
in learning visual representations from unlabeled images. This paper aims to
improve their performance further by exploiting the architectural advantages of
the underlying neural network, since current state-of-the-art visual pretext
tasks for SSL are architecture-agnostic and therefore do not enjoy those benefits.
In particular, we focus on Vision Transformers (ViTs), which have recently gained
much attention as a better architectural choice, often outperforming
convolutional networks on various visual tasks. The …
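Since the abstract is truncated here, the following is only a minimal illustrative sketch, not the paper's method: it shows one generic way a self-supervised objective could use ViT patch-level tokens (rather than a single global representation), with a stand-in encoder and a simple negative-cosine-similarity loss between two views. All names (TinyViTEncoder, patch_level_ssl_loss) and the objective itself are assumptions for illustration.

# Illustrative sketch only (NOT the paper's method): a generic self-supervised
# objective applied to ViT patch tokens instead of a single global embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyViTEncoder(nn.Module):
    """Hypothetical stand-in ViT: patchify -> linear embed -> transformer blocks."""
    def __init__(self, img_size=32, patch_size=8, dim=64, depth=2, heads=4):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        num_patches = (img_size // patch_size) ** 2
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):
        # x: (B, 3, H, W) -> patch tokens of shape (B, num_patches, dim)
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)
        return self.blocks(tokens + self.pos_embed)

def patch_level_ssl_loss(encoder, proj, view1, view2):
    """Negative cosine similarity between projected patch tokens of two views.
    Assumes the two views keep spatial correspondence between patches
    (e.g., only photometric augmentations); purely for illustration."""
    z1 = F.normalize(proj(encoder(view1)), dim=-1)  # (B, N, D)
    z2 = F.normalize(proj(encoder(view2)), dim=-1)
    return -(z1 * z2).sum(dim=-1).mean()

encoder = TinyViTEncoder()
proj = nn.Linear(64, 64)
v1, v2 = torch.randn(4, 3, 32, 32), torch.randn(4, 3, 32, 32)
loss = patch_level_ssl_loss(encoder, proj, v1, v2)
loss.backward()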

