Scattering Vision Transformer: Spectral Mixing Matters. (arXiv:2311.01310v1 [cs.CV])
cs.LG updates on arXiv.org
Vision transformers have gained significant attention and achieved
state-of-the-art performance in various computer vision tasks, including image
classification, instance segmentation, and object detection. However,
challenges remain in addressing attention complexity and effectively capturing
fine-grained information within images. Existing solutions often resort to
down-sampling operations, such as pooling, to reduce computational cost.
Unfortunately, such operations are non-invertible and can result in information
loss. In this paper, we present a novel approach called Scattering Vision
Transformer (SVT) to tackle these challenges. SVT …
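
The non-invertibility claim in the abstract can be illustrated with a minimal sketch. It is not the SVT method, only a plain 2x2 average pooling written for this example. The function name `avg_pool_2x2` and the two toy patches are assumptions introduced here, not taken from the paper.

```python
def avg_pool_2x2(image):
    """Average-pool a 2D list of numbers with a 2x2 window and stride 2."""
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Two hypothetical patches: one uniform, one with a sharp vertical edge.
flat = [[2, 2], [2, 2]]
edge = [[0, 4], [0, 4]]

pooled_flat = avg_pool_2x2(flat)
pooled_edge = avg_pool_2x2(edge)

# Both patches pool to the same value, so the edge cannot be
# recovered from the pooled output alone.
print(pooled_flat, pooled_edge)
```

Because distinct inputs map to the same pooled output, no inverse function can tell them apart, which is the sense in which fine-grained detail is lost.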