March 28, 2024, 4:45 a.m. | Hanxiao Zhang, Yifan Zhou, Guo-Hua Wang, Jianxin Wu

Abstract: Few-shot model compression aims to compress a large model into a more compact one with only a tiny training set (even without labels). Block-level pruning has recently emerged as a leading technique for achieving high accuracy and low latency in few-shot CNN compression. However, few-shot compression for Vision Transformers (ViT) remains largely unexplored and presents a new challenge. In particular, the issue of sparse compression exists in traditional CNN few-shot methods, which can only produce …

