Web: http://arxiv.org/abs/2206.07706

June 16, 2022, 1:11 a.m. | Jiahao Xie, Wei Li, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy

cs.LG updates on arXiv.org

We present Masked Frequency Modeling (MFM), a unified frequency-domain-based
approach for self-supervised pre-training of visual models. Instead of randomly
inserting mask tokens into the input embeddings in the spatial domain, we shift
the perspective to the frequency domain. Specifically, MFM
first masks out a portion of frequency components of the input image and then
predicts the missing frequencies on the frequency spectrum. Our key insight is
that predicting masked components in the frequency domain is more ideal …
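
The mask-then-predict step described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The circular low-pass/high-pass mask, the `radius` parameter, the function names `mask_frequencies` and `masked_frequency_loss`, and the squared-error loss restricted to the masked frequencies are all assumptions made for illustration. The abstract only says that a portion of the frequency components is masked and then predicted.

```python
import numpy as np


def mask_frequencies(image, radius, keep_low=True):
    """Zero out part of the 2D spectrum of a single-channel image.

    Assumption: the mask is a circle of the given radius around the zero
    frequency. keep_low=True keeps the inside (low-pass), keep_low=False
    keeps the outside (high-pass).
    """
    h, w = image.shape
    # Shift the spectrum so the zero frequency sits at (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    low = dist <= radius
    keep = low if keep_low else ~low
    # Masked frequencies are set to zero; kept ones pass through unchanged.
    masked_spectrum = spectrum * keep
    # Back to the spatial domain: this corrupted image would be the model input.
    corrupted = np.fft.ifft2(np.fft.ifftshift(masked_spectrum)).real
    return corrupted, spectrum, keep


def masked_frequency_loss(pred_spectrum, target_spectrum, keep):
    """Mean squared error over the masked (missing) frequencies only.

    Assumption: a plain squared-magnitude error, chosen for simplicity.
    """
    missing = ~keep
    if not missing.any():
        return 0.0
    diff = np.abs(pred_spectrum - target_spectrum) ** 2
    return float(diff[missing].mean())
```

In a training loop under these assumptions, a model would receive `corrupted`, its output would be transformed with the same `fft2` and `fftshift` calls, and `masked_frequency_loss` would compare that predicted spectrum against `spectrum` on the frequencies where `keep` is False.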


