July 25, 2022, 1:13 a.m. | Ioannis Kakogeorgiou, Spyros Gidaris, Bill Psomas, Yannis Avrithis, Andrei Bursuc, Konstantinos Karantzalos, Nikos Komodakis

cs.CV updates on arXiv.org

Transformers and masked language modeling are quickly being adopted and
explored in computer vision as vision transformers and masked image modeling
(MIM). In this work, we argue that masking image tokens differs from masking
text tokens because of the number of tokens in an image and the correlation among them. In
particular, to generate a challenging pretext task for MIM, we advocate a shift
from random masking to informed masking. We develop and exhibit this idea in
the context of distillation-based MIM, …
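The contrast between random and informed masking can be illustrated with a minimal sketch. The abstract is cut off before it says how the masking is informed, so everything specific here is an assumption: the per-token `scores`, the choice to mask the highest-scoring tokens, and the function names `random_mask` and `informed_mask` are hypothetical and are not taken from the paper.

```python
import random


def random_mask(num_tokens, mask_ratio, rng):
    """Mask round(mask_ratio * num_tokens) tokens chosen uniformly at random."""
    num_masked = round(mask_ratio * num_tokens)
    masked = set(rng.sample(range(num_tokens), num_masked))
    return [i in masked for i in range(num_tokens)]


def informed_mask(scores, mask_ratio):
    """Mask the tokens with the highest scores.

    `scores` is a hypothetical per-token importance value; where it comes
    from is an assumption of this sketch, not something the abstract states.
    """
    num_tokens = len(scores)
    num_masked = round(mask_ratio * num_tokens)
    # Rank token indices from highest to lowest score.
    order = sorted(range(num_tokens), key=lambda i: scores[i], reverse=True)
    masked = set(order[:num_masked])
    return [i in masked for i in range(num_tokens)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the random baseline is repeatable
    scores = [0.05, 0.40, 0.10, 0.30, 0.02, 0.13]  # made-up values for illustration
    print("random:  ", random_mask(len(scores), 0.5, rng))
    print("informed:", informed_mask(scores, 0.5))
```

Both functions hide the same number of tokens; they differ only in which tokens are hidden. Random masking ignores image content, while the informed variant concentrates the mask on whatever the score marks as important, which is one way a pretext task could be made harder.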
