all AI news
Spectral Convolutional Transformer: Harmonizing Real vs. Complex Multi-View Spectral Operators for Vision Transformer
March 28, 2024, 4:42 a.m. | Badri N. Patro, Vinay P. Namboodiri, Vijay S. Agneeswaran
cs.LG updates on arXiv.org arxiv.org
Abstract: Transformers used in vision have been investigated through diverse architectures - ViT, PVT, and Swin. These have worked to improve the attention mechanism and make it more efficient. Differently, the need for including local information was felt, leading to incorporating convolutions in transformers such as CPVT and CvT. Global information is captured using a complex Fourier basis to achieve global token mixing through various methods, such as AFNO, GFNet, and Spectformer. We advocate combining three …
arxiv cs.ai cs.cl cs.cv cs.lg cs.mm operators transformer type view vision
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Intern Large Language Models Planning (f/m/x)
@ BMW Group | Munich, DE
Data Engineer Analytics
@ Meta | Menlo Park, CA | Remote, US