Feb. 8, 2024, 5:47 a.m. | Guoqiang Liang, Jiahao Hu, Qingyue Wang, Shizhou Zhang

cs.CV updates on arXiv.org

Human de-occlusion, which aims to infer the appearance of invisible human parts from an occluded image, is valuable for many human-related tasks, such as person re-identification and intention inference. To address this task, this paper proposes a dynamic mask-aware transformer (DMAT), which dynamically augments information from human regions and weakens that from occluded regions. First, to enhance token representation, we design an expanded convolution head with enlarged kernels, which captures more valid local context and mitigates the influence of surrounding …

