FM-ViT: Flexible Modal Vision Transformers for Face Anti-Spoofing. (arXiv:2305.03277v1 [cs.CV])
cs.CV updates on arXiv.org
The availability of handy multi-modal (i.e., RGB-D) sensors has brought about
a surge of face anti-spoofing research. However, current multi-modal face
presentation attack detection (PAD) has two defects: (1) frameworks based on
multi-modal fusion require input modalities consistent with those used in
training, which seriously limits deployment scenarios. (2) The performance of
ConvNet-based models on high-fidelity datasets is increasingly limited. In this
work, we present a pure transformer-based framework, dubbed the Flexible Modal
Vision Transformer (FM-ViT), for …