Web: http://arxiv.org/abs/2111.07832

Jan. 28, 2022, 2:10 a.m. | Jinghao Zhou, Chen Wei, Huiyu Wang, Wei Shen, Cihang Xie, Alan Yuille, Tao Kong

cs.CV updates on arXiv.org

The success of language Transformers is primarily attributed to the pretext
task of masked language modeling (MLM), where texts are first tokenized into
semantically meaningful pieces. In this work, we study masked image modeling
(MIM) and highlight the advantages and challenges of using a semantically
meaningful visual tokenizer. We present a self-supervised framework iBOT that
can perform masked prediction with an online tokenizer. Specifically, we
perform self-distillation on masked patch tokens and take the teacher network
as the online tokenizer, …
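The core idea in the excerpt, self-distillation on masked patch tokens with the teacher acting as an online tokenizer, can be illustrated with a minimal numpy sketch. All dimensions, temperatures, and the zero-token masking below are illustrative assumptions, not the paper's actual architecture or hyperparameters:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)

# Toy dimensions (assumptions): 8 patch tokens, 16-dim embeddings,
# a 32-way "token vocabulary" projection head.
num_patches, dim, vocab = 8, 16, 32

# Student projection head; the teacher starts as a copy and is
# updated as an exponential moving average (EMA) of the student.
W_student = rng.normal(size=(dim, vocab))
W_teacher = W_student.copy()

patches = rng.normal(size=(num_patches, dim))
mask = rng.random(num_patches) < 0.4          # mask ~40% of patches

# Teacher sees the unmasked patches and emits soft token targets online
# (a sharper temperature than the student's, as in self-distillation).
targets = softmax(patches @ W_teacher / 0.04)

# Student sees the masked input (masked patches replaced here by zeros
# as a stand-in for a learned mask token) and predicts the targets.
student_in = np.where(mask[:, None], 0.0, patches)
preds = softmax(student_in @ W_student / 0.1)

# Self-distillation loss: cross-entropy, computed on masked positions only.
loss = -(targets[mask] * np.log(preds[mask] + 1e-9)).sum(axis=-1).mean()

# EMA update: the tokenizer is "online" (jointly learned), not a fixed,
# separately pretrained offline tokenizer.
momentum = 0.996
W_teacher = momentum * W_teacher + (1 - momentum) * W_student
```

The sketch only conveys the training signal's shape: the teacher turns image patches into soft token distributions on the fly, and the student is trained to recover them at masked positions.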
